Abstract: This paper explores optimization of paging and
registration policies in cellular networks. Motion is modeled as
a discrete-time Markov process, and minimization of the
discounted, infinite-horizon average cost is addressed. The structure
of jointly optimal paging and registration policies is investigated 
through the use of dynamic programming for partially observed 
Markov processes. It is shown that there exist policies with a 
certain simple form that are jointly optimal, though the dynamic 
programming approach does not directly provide an efficient 
method to find the policies. 

An iterative algorithm for policies with the simple form is 
proposed and investigated. The algorithm alternates between 
paging policy optimization and registration policy optimization. 
It finds a pair of individually optimal policies, but an example 
is given showing that the policies need not be jointly optimal. 
Majorization theory and Riesz's rearrangement inequality are 
used to show that jointly optimal paging and registration policies 
are given for symmetric or Gaussian random walk models by 
the nearest-location-first paging policy and distance threshold 
registration policies. 

Index Terms — Paging, registration, cellular networks, partially 
observed Markov processes, majorization, rearrangement theory 

I. Introduction 

The growing demand for personal communication services 
is increasing the need for efficient utilization of the limited
resources available for wireless communication. In order to
deliver service to a mobile station (MS), the cellular network 
must be able to track the MS as it roams. In this paper, the 
problem of minimizing the cost of tracking is discussed. Two 
basic operations involved in tracking the MS are paging and 
registration. 

There is a tradeoff between the paging and registration 
costs. If the MS registers its location within the cellular 
network more often, the paging costs are reduced, but the 
registration costs are higher. The traditional approach to paging
and registration in cellular systems uses registration areas 
which are groups of cells. An MS registers if and only if it 
changes registration area. Thus, when there is an incoming call 
directed to the MS, all the cells within its current registration 
area are paged. Another method uses reporting centers [3]. An 
MS registers only when it enters the cells of reporting centers, 
while every search for the MS is restricted to the vicinity of 
the reporting center to which it last reported. 

Fig. 1. Paging policy and registration policy generation

Some dynamic registration schemes are examined in [4]:
time-based, movement-based, and distance-based. These policies
are threshold policies and the thresholds depend on the
MS motion activities. In [14], dynamic programming is used to
determine an optimal state-based registration policy. Work in
[2] considers congestion among paging requests for multiple 
MSs, and considers overlapping registration regions. 

Basic paging policies can be classified as follows: 

• Serial Paging. The cellular network pages the MS sequentially,
one cell at a time.

• Parallel Paging. The cellular network pages the MS in a
collection of cells simultaneously.

Serial paging policies have lower paging costs than parallel
paging policies, but at the expense of larger delay. The method 
of parallel paging is to partition the cells in a service region 
into a series of indexed groups referred to as paging areas. 
When a call arrives for the MS, the cells in the first paging 
area are paged simultaneously in the first round and then, if the 
MS is not found in the first round of paging, all the cells in the 
second paging area are paged, and so on. Given disjoint paging 
areas, searching them in the order of decreasing probabilities 
minimizes the expected number of searches [19]. This
paging order is denoted as the maximum-likelihood serial
paging order. An interesting topic of paging is to design
the optimal paging areas within delay constraints [12, 19, 22].
However, in this paper, we consider only serial paging policies.
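
The claim that searching in order of decreasing probability minimizes the expected number of searches can be checked numerically on a small case. The following is a minimal sketch, not taken from [19]; the cell probabilities and the function names are hypothetical, and the brute-force comparison is only practical for a handful of cells.

```python
from itertools import permutations

def expected_pages(probs, order):
    """Mean number of single-cell pages when cells are searched in `order`."""
    return sum((rank + 1) * probs[cell] for rank, cell in enumerate(order))

def ml_order(probs):
    """Cells sorted by decreasing probability (ties broken by cell index)."""
    return sorted(range(len(probs)), key=lambda c: (-probs[c], c))

# Hypothetical location probabilities for four cells (assumption of this sketch).
probs = [0.1, 0.5, 0.15, 0.25]
best = ml_order(probs)

# Exhaustive search over all serial orders, for comparison.
brute = min(permutations(range(len(probs))),
            key=lambda o: expected_pages(probs, o))
```

Here `best` and `brute` describe the same search order, which is the maximum-likelihood serial paging order for these probabilities.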

Each paper mentioned above assumes a certain class of 
paging or registration policy. Given one policy (paging policy 
or registration policy) and the parameters of an assumed 
motion model, the counterpart policy (registration policy or
paging policy, respectively) is found. For instance, the optimal
paging policy is identified in [19] for a given registration
policy. This is shown as the top branch of Figure 1. Conversely,
an expanding "ping-pong" order paging policy suited to the
given motion model is assumed in [14]. With this knowledge,
dynamic programming is applied to solve for the optimal
registration policy. This corresponds to the bottom branch of
Figure 1.

Several studies have addressed minimizing the costs,
considering the paging and registration policies together [1, 20,
21]. In [20], a timer-based registration policy combined with
maximum-likelihood serial paging is introduced. The minimum
paging cost can be represented by the distribution of
locations where the MS last reported. Then an optimal timer 
threshold is selected to minimize the total cost of registration 
and paging. By contrast, a movement-based registration policy 
is used in [1]. An improvement of [20] is given in [21] by 
assuming that the MS knows not only the current time, but 
also its own state and the conditional distribution of its state 
given the last report. This is a state-based registration policy
that aims to minimize the total cost by running a greedy
algorithm on the potential costs. Although these papers discuss
the paging and registration policies together, they do not consider
jointly optimizing the policies.

The contributions of this paper are as follows. The structure
of jointly optimal paging and registration policies is identified.
It is shown that the conditional probability distribution of
the states of an MS can be viewed as a controlled Markov
process, controlled by both the paging and registration policies
at each time. Dynamic programming is applied to show that
the jointly optimal policies can be represented compactly by
certain reduced complexity laws (RCLs). An iterative algorithm
producing a pair of RCLs is proposed based on closing
the loop in Figure 1. The algorithm is a heuristic which merges
the approaches in [14] and [19]. Several examples are given.
The first example is an illustration of numerical computation
of an individually optimal policy pair. The second example is
a simple one illustrating that individually optimal policies are
not necessarily jointly optimal. Finally, three more examples
are given based on random walk models of motion:
one-dimensional discrete state symmetric, multidimensional
symmetric, and multidimensional Gaussian. Majorization theory
and Riesz's rearrangement inequality are used to show that
jointly optimal paging and registration policies are given for
these random walk models by the nearest-location-first paging
policy and distance threshold registration policies.

The paper is organized as follows. Notation and cost functions
are introduced in Section II. Jointly optimal policies are
investigated in Section III. The iterative optimization formulation
for computing individually optimal policy pairs is developed
in Section IV. The first two examples are given in Section
V, and the random walk examples are given in Section VI.
Conclusions are given in Section VII.



Earlier versions of this work appeared in [10, 11].



II. NETWORK MODEL 

A. State description and cost 

Let C denote the set of cells, which is assumed to be finite. 
A cell c is a physical location that the MS can physically be in. 
The motion of an MS is modeled by a discrete-time Markov 
process $(X(t) : t \ge 0)$ with a finite state space $S$, one-step
transition probability matrix $P = (p_{ij} : i, j \in S)$, and given
initial state $x_0$. A state $j \in S$ determines the cell $c$ that the
MS is physically located in, and it may indicate additional 
information, such as the current velocity of the MS, or the 
previously visited cell. Thus a cell c can be considered to be a 
set of one or more states, and the set C of all cells is a partition 
of $S$. It is assumed that the network knows the initial state $x_0$.
In the special case that there is one state per cell, we write
$C = S$, and then the MS moves among the cells according to
a Markov process. 

The possible events at a particular integer time instant $t \ge 1$
are as follows, listed in the order that they can occur. First, the
state $X(t)$ is generated based on $X(t-1)$ and the one-step
transition probability matrix $P$. Then, it is determined whether
the MS is to be paged, and the answer is "yes" with probability
$\lambda_p$, independently of the state of the MS and all past events.
The cost of the paging at time $t$ is $\mathcal{P} N_t$, where $\mathcal{P}$ is the cost
of searching one cell and $N_t$ is the number of cells that are
searched until the MS is found. If the MS is paged, the cellular
network learns the state, $X(t)$. Let $N_t = 0$ if the MS is not
paged at time $t$. Finally, if the MS was not paged, the MS
decides whether to register. The cost of registration is $\mathcal{R}$ and
the benefit of registration is that the cellular network learns
the state of the MS. No paging or registration is considered
for $t = 0$.

Let $P_t$ denote the event that the MS is paged at time $t$, and
let $R_t$ denote the event that the MS registers at time $t$. We say
that a report occurs at time $t$ if either a paging or a registration
occurs, because in either case, the cellular network learns the
state of the MS. For any set $A$, let $I_A$ denote the indicator
function of $A$, which is one on $A$ and zero on the complement
$A^c$. Discrete probability distributions are considered to be row
vectors. Given a state $l \in S$, let $\delta(l)$ denote the probability
distribution on $S$ which assigns probability one to state $l$. Thus,
$\delta_i(l) = I_{\{i = l\}}$. See the appendix for a review of the notions
of $\sigma$-algebras used in this paper.

B. Paging policy notation 

For simplicity we consider only serial paging policies, so 
that cells are searched one at a time until the MS is located. It 
is also assumed that if the MS is present in the cell in which it 
is paged, it responds to the page successfully. In other words, 
no paging failure is allowed. It is further assumed that the 
time it takes to issue a single-cell page is negligible compared 
to one time step of the MS's motion model, so that paging is 
always successfully completed within one time step. 

Let $\mathcal{N}_t$ denote the $\sigma$-algebra representing the information
available to the network by time $t$ after the paging and
registration decisions have been made and carried out. Thus,
for $t \ge 0$,

$$\mathcal{N}_t = \sigma\big( (I_{P_s}, N_s, I_{R_s} : 1 \le s \le t),\ (X(s) : 1 \le s \le t \text{ and } I_{P_s \cup R_s} = 1) \big).$$

The initial state $x_0$ is treated as a constant, so even though it
is known to the network it is not included in the definition of
$\mathcal{N}_t$. Note that the initial $\sigma$-algebra $\mathcal{N}_0$ is the trivial $\sigma$-algebra:
$\mathcal{N}_0 = \{\emptyset, \Omega\}$.

When the MS is to be paged, the cells are to be searched
sequentially according to a permutation $\sigma$ of the cells. The
associated paging order vector $r = (r_j : j \in S)$ is such that
for each state $j$, $r_j$ is the number of cells that must be paged
until the cell for state $j$ comes up, and the MS is reached. For
example, suppose $S = \{1,2,3,4,5,6\}$ and $C = \{c_1, c_2, c_3\}$
with $c_1 = \{1,2\}$, $c_2 = \{3,4\}$, and $c_3 = \{5,6\}$. Then if the
cells are searched according to the permutation $\sigma = (c_2, c_1, c_3)$,
meaning to search cell $c_2$ first, $c_1$ second, and $c_3$ third, then the
paging order vector is $r = (2, 2, 1, 1, 3, 3)$. A paging policy $u$
is a collection $u = (u(t) : t \ge 1)$ such that for each $t \ge 1$,
$u(t)$ is an $\mathcal{N}_{t-1}$ measurable random variable with values in
the set of paging order vectors. Note that $N_t = I_{P_t} u_{X(t)}(t)$.
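
The mapping from a search permutation to its paging order vector can be sketched as follows, using the six-state, three-cell example given in the text. The function name and the dictionary representation of the cells are illustrative choices of this sketch, not notation from the paper.

```python
def paging_order_vector(cells, sigma):
    """Map a search order over cells to the paging order vector over states.

    cells : dict mapping a cell name to the list of states it contains
    sigma : list of cell names in the order they are searched
    Returns a dict: state -> number of cells paged until that state's cell is hit.
    """
    r = {}
    for rank, cell in enumerate(sigma, start=1):
        for state in cells[cell]:
            r[state] = rank
    return r

# The example from the text: search c2 first, c1 second, c3 third.
cells = {"c1": [1, 2], "c2": [3, 4], "c3": [5, 6]}
r = paging_order_vector(cells, ["c2", "c1", "c3"])
vector = [r[j] for j in sorted(r)]   # entries listed for states 1..6
```

The resulting `vector` reproduces the paging order vector (2, 2, 1, 1, 3, 3) stated above.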

C. Registration policy notation 

Let $\mathcal{M}_t$ denote the $\sigma$-algebra representing the information
available to the MS by time $t$, after the paging and registration
decisions for time $t$ have been made and carried out. Thus,

$$\mathcal{M}_t = \sigma( X(s), I_{P_s}, N_s, I_{R_s} : 1 \le s \le t ).$$

The MS also knows the initial position $x_0$, which is treated as
a constant. In practice an MS wouldn't learn $N_s$, the number
of pages used to find the MS at time $s$. While we assume
such information is available to the MS, we will see that
optimal policies need not make use of the information. With
this definition, we have $\mathcal{N}_t \subset \mathcal{M}_t$, meaning that the MS knows
everything the network knows (and typically more).

When the MS has to decide whether to register at time $t$, it
already has the information $\mathcal{M}_{t-1}$. In addition it knows $X(t)$
and $I_{P_t}$. If the MS is paged at time $t$, then the network learns
the state of the MS as a result, so there is no advantage for
the MS to register at time $t$. Thus, we assume without loss of
generality that the MS does not register at time $t$ if it is paged
at time $t$. This leads to the following definition.

A registration policy $v$ is a collection $v = (v(t) : t \ge 1)$
such that for each $t \ge 1$, $v(t)$ is an $\mathcal{M}_{t-1}$ measurable random
vector with values in $[0, 1]^S$ with the following interpretation.
Given the information $\mathcal{M}_{t-1}$, if $X(t) = l$ and if the MS is not
paged at time $t$, then the MS registers with probability $v_l(t)$.

D. Cost function 

Let $\beta$ be a number with $0 < \beta < 1$, called the discount
factor. An interpretation of $\beta$ is that $1/(1-\beta)$ is the rough time
horizon of interest. Given a paging policy $u$ and registration
policy $v$, the expected infinite horizon discounted cost $C(u, v)$
is defined as

$$C(u,v) = E\left[ \sum_{t=1}^{\infty} \beta^t \{ \mathcal{P} I_{P_t} N_t + \mathcal{R} I_{R_t} \} \right].$$

The pair $(u,v)$ is jointly optimal if $C(u,v) \le C(u',v')$ for
every paging policy $u'$ and registration policy $v'$.
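
To make the order of events and the cost definition concrete, the following sketch draws one sample of the discounted cost, truncated at a finite horizon. It rests on several simplifying assumptions that are not part of the model above: one state per cell, and policies that depend only on the last reported state and the time elapsed since that report. All function and parameter names are hypothetical.

```python
import random

def simulate_cost(P, x0, page_rank, reg_prob, lam_p, page_cost, reg_cost,
                  beta, horizon, rng):
    """One sample of sum_{t=1}^{horizon} beta^t {page_cost*I_Pt*N_t + reg_cost*I_Rt}.

    P         : transition matrix as a list of rows
    page_rank : function (i0, k) -> list r, r[l] = pages needed if MS is in state l
    reg_prob  : function (i0, k) -> list d, d[l] = registration probability in state l
    (i0, k)   : last reported state and time elapsed since that report, both
                evaluated at the moment the decision is taken
    """
    x, i0, k, total = x0, x0, 0, 0.0
    for t in range(1, horizon + 1):
        # 1) the state moves according to the Markov chain
        x = rng.choices(range(len(P)), weights=P[x])[0]
        k += 1
        # 2) a page arrives with probability lam_p, independently of everything
        if rng.random() < lam_p:
            total += beta ** t * page_cost * page_rank(i0, k)[x]
            i0, k = x, 0                      # network learns the state
        # 3) only if not paged, the MS may register
        elif rng.random() < reg_prob(i0, k)[x]:
            total += beta ** t * reg_cost
            i0, k = x, 0                      # network learns the state
    return total
```

Averaging many such samples, with a horizon that is large compared with $1/(1-\beta)$, gives a Monte Carlo estimate of the cost for policies of this restricted form.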

III. JOINTLY OPTIMAL POLICIES 

This section investigates the structure of jointly optimal
policies by using the theory of dynamic programming for
Markov control problems with partially observed states. While
the structure results do not directly yield a computationally
feasible solution, they shed light on the nature of the problem.
In particular it is found that there are jointly optimal policies
$(u,v)$ such that, for each $t$, $u(t)$ and $v(t)$ are functions of
the amount of time elapsed since the last report and the last
reported state.

Intuitively, on one hand, the paging policies are selected
based on the past of the registration policy, because the past
of the registration policy influences the conditional distribution 
of the MS state. On the other hand, by the nature of dynamic 
programming, the optimal choice of registration policy at a 
given time depends on future costs, which are determined 
by the future of the registration policy. To break this cycle, 
we consider the problem entirely from the viewpoint of the 
network. In order that current decisions not depend on past 
actions, the state space is augmented by the conditional distri- 
bution of the state of the MS given the information available 
to the network. 

A. Evolution of conditional distributions 

For $t \ge 0$, let $w(t)$ be the conditional probability distribution
of $X(t)$, given the observations available to the network up to
time $t$ (including the outcomes of a report at time $t$, if there
was any). That is, $w_j(t) = P[X(t) = j \mid \mathcal{N}_t]$ for $j \in S$. Note
that, with probability one, $w(t)$ is a probability distribution on
$S$. Intuitively, the network can control the distribution valued
process $(w(t))$ by dictating the registration policy of the MS.
Since $\mathcal{N}_0$ is the trivial $\sigma$-algebra and $X(0) = x_0$, the initial
conditional distribution is given by $w(0) = \delta(x_0)$.

While the network may not know the recent past trajectory
of the state process, it can still estimate the registration
probabilities used by the MS. In particular, as shown in the next
lemma, the estimate $\hat{v}_j(t)$, defined by
$\hat{v}_j(t) = E[v_j(t) \mid X(t) = j, \mathcal{N}_{t-1}]$, plays a role in how the network
can recursively update the $w(t)$'s. In more conventional notation, we have

$$\hat{v}_j(t) = \frac{E[v_j(t) I_{\{X(t)=j\}} \mid \mathcal{N}_{t-1}]}{P[X(t) = j \mid \mathcal{N}_{t-1}]}.$$



Define a function $\Phi$ as follows. Let $w$ be a probability
distribution on $S$ and let $d \in [0, 1]^S$. Let $\Phi(w, d)$ denote the
probability distribution on $S$ defined by

$$\Phi_l(w,d) = \frac{\sum_{j \in S} w_j p_{jl} (1 - d_l)}{\sum_{l' \in S} \sum_{j \in S} w_j p_{jl'} (1 - d_{l'})}. \tag{1}$$

$\Phi(w,d)$ is undefined if the denominator in this definition is
zero. The meaning of $\Phi$ is that if at time $t$ the network knows
that $X(t)$ has distribution $w$, if no paging occurs at time $t+1$,
and if the MS registers at time $t+1$ with probability $d_{X(t+1)}$,
then $\Phi(w,d)$ is the conditional distribution of $X(t+1)$ given
no registration occurs at time $t+1$. This interpretation is made
precise in the next lemma. The proof is in the appendix.
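
The update $\Phi$ amounts to one step of the Markov chain followed by conditioning on the event that no registration occurred. A minimal sketch is given below; representing distributions and the transition matrix as plain Python lists is an assumption of this sketch, as are the example numbers.

```python
def phi(w, P, d):
    """Conditional distribution of X(t+1) given no page and no registration at t+1.

    w : distribution of X(t) as a list
    P : transition matrix as a list of rows
    d : d[l] is the probability that the MS registers if X(t+1) = l
    """
    n = len(w)
    # unnormalized mass of each state l surviving without a registration
    unnorm = [sum(w[j] * P[j][l] for j in range(n)) * (1.0 - d[l])
              for l in range(n)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("Phi(w, d) is undefined: the denominator is zero")
    return [x / total for x in unnorm]

# Hypothetical three-state example: the MS was last seen in state 0 and
# registers with certainty whenever it lands in state 2.
P = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]
w_next = phi([1.0, 0.0, 0.0], P, [0.0, 0.0, 1.0])
```

In this example the absence of a registration rules out state 2, so the remaining mass is split between states 0 and 1 in proportion 2 to 1.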

Lemma 3.1: The following holds, under the paging and
registration policies $u$ and $v$.

$$w(t+1) = \delta(X(t+1))\, I_{P_{t+1} \cup R_{t+1}} + \Phi(w(t), \hat{v}(t+1))\, I_{(P_{t+1} \cup R_{t+1})^c}. \tag{2}$$



B. New state process 

For $t \ge 1$, let $\Theta(t) = (w(t), I_{P_t}, N_t, I_{R_t})$. Note that the $t$th
term in the cost function is a function of $\Theta(t)$. Note also that
$\Theta(t)$ is measurable with respect to $\mathcal{N}_t$, so that the network can
calculate $\Theta(t)$ at time $t$ (after possible paging and registration).
Moreover, the first coordinate of $\Theta(t)$, namely $w(t)$, can be
updated with increasing $t$ with the help of Lemma 3.1. The
random process $(\Theta(t) : t \ge 0)$ can be viewed as a controlled
Markov process, adapted to the family of $\sigma$-algebras $(\mathcal{N}_t : t \ge 0)$
with controls $(u(t), \hat{v}(t) : t \ge 1)$. Note that $u(t+1)$ and
$\hat{v}(t+1)$ are each $\mathcal{N}_t$ measurable for each $t \ge 0$. The one-step
transition probabilities for $(\Theta(t))$ are given as follows. (The
variables $j$ and $l$ range over the set of states $S$.)

$$\begin{array}{ll}
\Theta(t+1) & \text{probability} \\ \hline
(\delta(l), 1, u_l(t+1), 0) & \lambda_p \sum_j w_j(t) p_{jl} \\
(\delta(l), 0, 0, 1) & (1-\lambda_p) \sum_j w_j(t) p_{jl} \hat{v}_l(t+1) \\
(\Phi(w(t), \hat{v}(t+1)), 0, 0, 0) & (1-\lambda_p) \sum_l \sum_j w_j(t) p_{jl} (1 - \hat{v}_l(t+1))
\end{array}$$

Observe that although the MS uses a registration policy
$v$, the one-step transition probabilities for $\Theta$ depend only on
$\hat{v}$. Moreover, $\hat{v}$ is itself a registration policy. Indeed, since
$\mathcal{N}_{t-1} \subset \mathcal{M}_{t-1}$, $\hat{v}(t)$ is $\mathcal{M}_{t-1}$ measurable, and it takes values
in $[0, 1]^S$. If $\hat{v}$ were used instead of $v$ as a registration policy
by the MS, the one-step transition probabilities for $\Theta$ would
be unchanged. Thus, the policy $\hat{v}$ is adapted to the family of
$\sigma$-algebras $(\mathcal{N}_t : t \ge 0)$, and it yields the same cost as $v$.
Therefore, without loss of generality, we can restrict attention
to registration policies $v$ that are adapted to $(\mathcal{N}_t : t \ge 0)$.

Combining the observations summarized in this section, we
arrive at the following proposition.

Proposition 3.1: The original joint optimization problem is
equivalent to a Markov optimal control problem with state
process $(\Theta(t) : t \ge 0)$ adapted to the family of $\sigma$-algebras
$(\mathcal{N}_t : t \ge 0)$, with controls $(u(t), v(t) : t \ge 1)$.

C. Dynamic programming equations

Above it was assumed that $w(0) = \delta(x_0)$, where $x_0$ is
the initial state of the MS, assumed known by the network.
In order to apply the dynamic programming technique, in
this section the initial distribution $w(0)$ is allowed to be any
probability distribution on $S$. It is assumed that the network
knows $w(0)$ at time zero, and that the initial state of the MS is
random, with distribution $w(0)$. The evolution of the system
as described in the previous section is well defined for an
arbitrary initial distribution $w(0)$. Let $E_w$ denote conditional
expectation in case the initial distribution $w(0)$ is taken to
be $w$. The initial $\sigma$-algebra $\mathcal{N}_0$ is still the trivial $\sigma$-algebra,
because $w(0)$ is treated as a given constant.

Define the cost with $n$ steps to go as

$$U_n(w) = \min_{u,v} E_w\left[ \sum_{t=1}^{n} \beta^t \{ \mathcal{P} I_{P_t} N_t + \mathcal{R} I_{R_t} \} \right].$$

Next apply the backwards solution method of dynamic programming,
by separating out the $t = 1$ term in the cost for
$n + 1$ steps to go. This yields

$$U_{n+1}(w) = \min_{u,v} \beta \left\{ \lambda_p \mathcal{P} \sum_j \sum_l w_j p_{jl} u_l(1) + (1-\lambda_p) \mathcal{R} \sum_j \sum_l w_j p_{jl} v_l(1) + E_w\left[ \sum_{t=2}^{n+1} \beta^{t-1} \{ \mathcal{P} I_{P_t} N_t + \mathcal{R} I_{R_t} \} \right] \right\}.$$

Note that $u(1)$ and $v(1)$ are both measurable with respect
to the trivial $\sigma$-algebra $\mathcal{N}_0$. Therefore these controls are
constants. Henceforth we write $d$ for the registration decision
vector $v(1)$. The vector $d$ ranges over the space $[0, 1]^S$.

The first sum in the expression for $U_{n+1}(w)$ involves the
control policies only through the choice of the paging order
vector $u(1)$. This sum is simply the mean number of single-cell
pages required to find the MS given that the state of the
MS has distribution given by the product $wP$, where $P$ is the
matrix of state transition probabilities. It is well known that
the optimal search order is to first search the cell with the
largest probability, then search the cell with the second largest
probability, and so on [19]. Ties can be broken arbitrarily. The
first sum in the expression for $U_{n+1}(w)$ can thus be replaced
by $s(wP)$, where $s(q)$ denotes the mean number of single-cell
pages required to find the MS given that the state of the MS
has distribution $q$ and the optimal paging policy is used. (We
remark that Massey [17] explored comparisons between $s(w)$,
in the case of one state per cell, and the ordinary entropy,
$H(w) = -\sum_i w_i \log w_i$. The measure $s(w)$ was called the
guessing entropy in [8], and work continues to compare it
to other forms of entropy [9].)
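
The quantity $s(q)$ is straightforward to compute: aggregate the state probabilities by cell, sort the cell probabilities in decreasing order, and weight each by its position in the search. The sketch below does this and also evaluates the ordinary entropy for comparison. The representation of cells as lists of state indices and the example distribution are assumptions of this sketch.

```python
import math

def cell_probs(q, cells):
    """Probability that the MS is in each cell, given state distribution q."""
    return [sum(q[j] for j in cell) for cell in cells]

def s(q, cells):
    """Mean number of single-cell pages under the decreasing-probability order."""
    ranked = sorted(cell_probs(q, cells), reverse=True)
    return sum((i + 1) * p for i, p in enumerate(ranked))

def entropy(q):
    """Ordinary entropy H(q) = -sum_i q_i log q_i (natural logarithm)."""
    return -sum(p * math.log(p) for p in q if p > 0)

# Hypothetical distribution on six states grouped into three two-state cells.
q = [0.1, 0.2, 0.3, 0.15, 0.05, 0.2]
cells = [[0, 1], [2, 3], [4, 5]]
mean_pages = s(q, cells)          # cell probabilities are 0.30, 0.45, 0.25

# With one state per cell and a uniform distribution on four states,
# s equals (1 + 2 + 3 + 4) / 4 while the entropy equals log 4.
uniform = [0.25] * 4
singletons = [[0], [1], [2], [3]]
```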

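The dynamic programming equation thus becomes

$$U_{n+1}(w) = \beta \lambda_p \mathcal{P} s(wP) + \min_d \beta \Big[ \sum_j \sum_l w_j p_{jl} \big\{ \lambda_p U_n(\delta(l)) + (1-\lambda_p) d_l (\mathcal{R} + U_n(\delta(l))) \big\} + (1-\lambda_p) \Big( \sum_j \sum_l w_j p_{jl} (1 - d_l) \Big) U_n(\Phi(w,d)) \Big]. \tag{3}$$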



Formally we denote this equation as $U_{n+1} = T(U_n)$. By a
standard argument for dynamic programming with discounted
cost, $T$ has the following contraction property:

$$\sup |T(U) - T(U')| \le \beta \sup |U - U'| \tag{4}$$

for any bounded, measurable functions $U$ and $U'$, defined on
the space of all probability distributions $w$ on $S$. Consequently
[5, 6], there exists a unique $U_*$ such that $T(U_*) = U_*$, and
$U_n \to U_*$ uniformly as $n \to \infty$. Moreover, $U_*$ is the minimum
possible cost, and a jointly optimal pair of paging and registration
policies is given by a pair $(f, g)$ of state feedback controls,
for the state process $(w(t))$. A jointly optimal control is given
by $u(t) = f(w(t-1))$ and $v(t) = g(w(t-1))$, where $f$ and
$g$ are determined as follows. For any probability distribution
$w$ on $S$, $f(w)$ is the paging order vector for paging the cells
in order of decreasing probability under distribution $wP$, and
$g(w)$ is a value of $d$ that achieves the minimum in the right
hand side of (3) with $U_n$ replaced by $U_*$. Then if there is no
report at time $t+1$, the conditional distribution $w(t)$ is updated
simply by:

$$w(t+1) = \Phi(w(t), g(w(t))). \tag{5}$$

Clearly under such stationary state feedback control laws
$(f,g)$, the process $(w(t) : t \ge 0)$ is a time-homogeneous
Markov process. Note that the optimal mapping $f$ does not
depend on $g$.

Lemma 3.2: The registration policy $g$ can be taken to be
$\{0, 1\}^S$ valued (rather than $[0, 1]^S$ valued) without loss of
optimality.

Proof: It is first proved that $U_n$ is concave for any given
$n \ge 0$. Suppose $w_1$ and $w_2$ are two probability distributions
on $S$, suppose $0 < \eta < 1$ and suppose $w = \eta w_1 + (1 - \eta) w_2$.
Then $U_n(w)$ can be viewed as the cost to go given the
MS has distribution $w_1$ with probability $\eta$ and distribution $w_2$
with probability $1 - \eta$, and the network does not know which
distribution is used. The sum $\eta U_n(w_1) + (1 - \eta) U_n(w_2)$ has
a similar interpretation, except the network does know which
distribution is used. Thus, the sum is less than or equal to
$U_n(w)$, so that $U_n$ is concave. Therefore $U_*$ is also concave.

Given a function $H$ defined on the space of all probability
distributions for $S$, let $\widetilde{H}$ be an extension of $H$ defined on
the positive quadrant $\mathbb{R}_+^S$ as follows. For any probability
distribution $w$ and any constant $c \ge 0$, $\widetilde{H}(cw) = cH(w)$.
It is easy to show that if $H$ is concave then the extension $\widetilde{H}$
is also concave. With this notation, the dynamic programming
equation for $U_*$ can be written as:

$$U_*(w) = \beta \lambda_p \mathcal{P} s(wP) + \min_d \beta \Big[ \sum_j \sum_l w_j p_{jl} \big\{ \lambda_p U_*(\delta(l)) + (1-\lambda_p) d_l (\mathcal{R} + U_*(\delta(l))) \big\} + (1-\lambda_p) \widetilde{U}_*(wP\,\mathrm{diag}(1 - d)) \Big],$$

where $\mathrm{diag}(1 - d)$ is the diagonal matrix with $l$th entry $1 - d_l$.
The expression to be minimized over $d$ in this equation is a
concave function of $d$, and hence the minimum of the function
occurs at one of the extreme points of $[0, 1]^S$, which are just
the binary vectors $\{0,1\}^S$. The minimizing $d$ is $g(w)$. This
completes the proof of the lemma. ■

D. Reduced complexity laws 

Given a pair of feedback controls $(f,g)$, a more compact
representation of the controls is possible. Indeed, suppose the
controls are used, and suppose in addition that $X(0) = x_0$,
where $x_0$ is an initial state known to the network. Given $t \ge 1$,
define $k \ge 1$ and $i_0 \in S$ as follows. If there was a report
before time $t$, let $t - k$ be the time of the last report before
$t$. If there was no report before time $t$, let $k = t$. In either
case, let $i_0 = X(t - k)$. Since the network knows $X(t - k)$ at
time $t - k$ (after possible paging or registration), we have that
$w(t - k) = \delta(i_0)$. Since there were no state updates during
the times $t - k + 1, \ldots, t - 1$, it follows that $w(t - 1)$ is the
result of applying the update (5) $k - 1$ times, beginning with
$\delta(i_0)$. Hence, $w(t - 1)$ is a function of $(i_0, k)$. Moreover, since
$u(t) = f(w(t - 1))$ and $v(t) = g(w(t - 1))$, it follows that
both the paging order vector $u(t)$ and the registration decision
vector $v(t)$ are determined by $i_0$ and $k$. Let $f$ and $g$ denote the
mappings such that $u(t) = f(i_0, k)$ and $v(t) = g(i_0, k)$. Note
that $f(i_0, k)$ is a paging order vector and $g(i_0, k) \in \{0,1\}^S$
for each $(i_0, k)$. We call the mappings $f$, $g$ reduced complexity
laws (RCLs). We have shown the following proposition.

Fig. 2. Example of a registration policy represented by an RCL for a
three-state Markov chain. (Horizontal axis: time since last report $k$;
the plotted curve is the path of the MS.)

Proposition 3.2: There is no loss in optimality for the 
original joint paging and registration problem to use policies 
based on RCLs. 

Figure 2 shows an example of a registration RCL $g$ for a
three-state Markov chain. The augmented state of an MS is a
triple $(i_0, k, j)$, such that $i_0$ is the state at the time of the
last report, $k$ is the elapsed time since the last report, and $j$
is the current state. Augmented states marked with an "x"
are those for which $g_j(i_0, k) = 1$, meaning that registration
occurs (if paging doesn't occur first). An MS traverses a path
from left to right until either it is paged, or until it hits a
state marked with an "x," at which time its augmented state
instantaneously jumps. The figure shows the path of an MS
that began in augmented state $(i_0, k, j) = (2, 0, 2)$. At relative
time $k = 5$ the MS entered state 3, hitting an "x", causing
the augmented state to instantly change to $(3, 0, 3)$. Three time
units after that, upon entering state 1, the MS is paged. This
causes the augmented state to instantly jump to $(1, 0, 1)$.

IV. ITERATIVE ALGORITHM FOR FINDING
INDIVIDUALLY OPTIMAL POLICIES

A. Overview of iterative optimization formulation 

While jointly optimal policies can be efficiently represented
by RCLs $f$ and $g$, the dynamic programming method described
for finding the optimal policies is far from computationally
feasible, even for small state spaces, because functions of
distributions on the state space must be considered. In this
section we explore the following method for finding a pair of
policies with a certain local optimality property. First it is shown
how to find, for a given paging RCL $f$, an optimal registration
RCL $g$. Then it is shown how to find, for a given registration
RCL $g$, an optimal paging RCL $f$. Iterating between these
two optimization problems produces a pair of RCLs $(f, g)$
such that for each RCL fixed, the other is optimal. Such pairs
of RCLs are said to be individually optimal.

In this section we impose the constraint that an MS must
register if $k > k_{\max}$, for some large integer constant $k_{\max}$.
With this constraint, the sets of possible registration and
paging RCLs are finite, and numerical computation is feasible
for fairly large state spaces. The initial state $x_0$ is assumed
to be known and we write $C(f, g)$ for the average infinite
horizon, discounted cost, for paging RCL $f$ and registration
RCL $g$.



B. Optimal registration RCL for given paging RCL 

Suppose a paging RCL $f$ is fixed. In this subsection we
address the problem of finding a registration RCL $g$ that
minimizes $C(f, g)$ with respect to $g$. Dynamic programming
is again used, but here the viewpoint of the MS is taken. The
states used for dynamic programming in this section are the
augmented states of the form $(i_0, k, j)$, rather than the set of
all probability distributions on $S$.

Since time is implicitly included in the variable $k$ in the
augmented state, it is computationally more efficient to consider
dynamic programming iterations based on cycles rather
than on single time steps, where each cycle ends when there
is a report. Let $\tau_m$ be the time of the $m$th report. Replacing
the infinite horizon by time horizon $\tau_m$ reduces $C(f, g)$ to

$$E\left[ \sum_{t=1}^{\tau_m} \beta^t \{ \mathcal{P} I_{P_t} N_t + \mathcal{R} I_{R_t} \} \right]. \tag{6}$$

Letting $m \to \infty$ in (6) yields $C(f,g)$.

Then for each augmented state $(i_0, k, j)$, write $V_m(i_0, k, j)$ for
the cost-to-go for $m \ge 1$ update cycles:

$$V_m(i_0, k, j) = \min_v E\left[ \sum_{t=1}^{\tau_m} \beta^t \{ \mathcal{P} I_{P_t} N_t + \mathcal{R} I_{R_t} \} \right] \tag{7}$$

where the expectation $E$ is taken assuming that (a) the paging
RCL $f$ is used for the paging policy, (b) at $t = 0$ the MS is
in state $j$, and (c) the last report occurred $k$ time units earlier
in state $i_0$. Also, define $V_0(i_0, k, j) = 0$, because the cost is
zero when there are no report cycles to go.



The dynamic programming optimality equations are given
by

$$V_m(i_0, k, j) = \beta \sum_l p_{jl} \Big[ \lambda_p \big( \mathcal{P} f_l(i_0, k+1) + V_{m-1}(l, 0, l) \big) + (1 - \lambda_p) \min\{ V_m(i_0, k+1, l),\ \mathcal{R} + V_{m-1}(l, 0, l) \} \Big]. \tag{8}$$

As mentioned earlier, registration is forced at relative time
$k = k_{\max} + 1$ for some large but fixed value $k_{\max}$. Therefore we set
$V_m(i_0, k_{\max} + 1, l) = \infty$ and use (8) only for $0 \le k \le k_{\max}$.
These equations represent the basic dynamic programming
optimality relations. For each possible next state, the MS
chooses whichever action has lesser cost: either continuing
the current registration cycle or registering for cost $\mathcal{R}$.

Equation (8) can be used to compute the functions $V_m$
sequentially in $m$ as follows. The initial conditions are $V_0 \equiv 0$.
Once $V_{m-1}$ is computed, the values $V_m(i_0, k, j)$ can be
computed using (8), sequentially for $k$ decreasing from $k_{\max}$
to 0. Formally we denote this computation as $V_m = T(V_{m-1})$.
The mapping $T$ is a contraction with constant $\beta$ in the
sup norm, so that $V_m$ converges uniformly to a function $V_*$
satisfying the limiting form of (8):

$$V_*(i_0, k, j) = \beta \sum_l p_{jl} \Big[ \lambda_p \big( \mathcal{P} f_l(i_0, k+1) + V_*(l, 0, l) \big) + (1 - \lambda_p) \min\{ V_*(i_0, k+1, l),\ \mathcal{R} + V_*(l, 0, l) \} \Big] \tag{9}$$

for $0 \le k \le k_{\max}$, and $V_*(i_0, k_{\max} + 1, l) = \infty$. The
corresponding optimal registration RCL $g^*$ is given by

$$g^*_l(i_0, k) = \begin{cases} 0, & \text{if } V_*(i_0, k+1, l) \le \mathcal{R} + V_*(l, 0, l) \\ 1, & \text{else} \end{cases} \tag{10}$$

for $i_0 \in S$ and $1 \le k \le k_{\max}$.

Thus, for a given paging RCL $f$, we have identified how to
compute a registration RCL $g$ to minimize $C(f,g)$.
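
The computation described in this subsection can be sketched as a value iteration over report cycles. The code below is an illustrative sketch rather than the authors' implementation: the function name, the nested-list data layout, the stopping tolerance, and the convention that `g[i0][k][l]` is the decision applied when the MS moves into state `l` at elapsed time `k + 1` (so that it compares the same quantities as the right-hand side of (10)) are all assumptions.

```python
import math

def optimal_registration_rcl(P, f, lam_p, page_cost, reg_cost, beta, kmax,
                             tol=1e-10, max_iter=10_000):
    """Iterate the cycle-based dynamic program for a fixed paging RCL f.

    P : transition matrix (list of rows); f[i0][k][l] : paging order entries,
    used for k = 1..kmax+1. Returns (V, g) with V[i0][k][j] approximating the
    limiting cost-to-go and g[i0][k][l] in {0, 1} for k = 0..kmax.
    """
    n = len(P)
    restart = [0.0] * n          # restart[l] plays the role of V_{m-1}(l, 0, l)
    for _ in range(max_iter):
        # V(i0, kmax+1, l) = inf encodes the forced registration
        V = [[[math.inf] * n for _ in range(kmax + 2)] for _ in range(n)]
        for i0 in range(n):
            for k in range(kmax, -1, -1):        # k decreasing from kmax to 0
                for j in range(n):
                    total = 0.0
                    for l in range(n):
                        paged = page_cost * f[i0][k + 1][l] + restart[l]
                        unpaged = min(V[i0][k + 1][l], reg_cost + restart[l])
                        total += P[j][l] * (lam_p * paged
                                            + (1.0 - lam_p) * unpaged)
                    V[i0][k][j] = beta * total
        new_restart = [V[l][0][l] for l in range(n)]
        converged = max(abs(a - b) for a, b in zip(new_restart, restart)) < tol
        restart = new_restart
        if converged:
            break
    # register (1) only when continuing the cycle is strictly more expensive
    g = [[[0 if V[i0][k + 1][l] <= reg_cost + restart[l] else 1
           for l in range(n)] for k in range(kmax + 1)] for i0 in range(n)]
    return V, g
```

As a sanity check under these assumptions: if the MS is never paged ($\lambda_p = 0$), the only cost is registration, and delaying registration until it is forced is optimal. The cost-to-go from a fresh report then satisfies $A = \beta^{K}(\mathcal{R} + A)$ with $K = k_{\max} + 1$.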



C. Optimal paging RCL for given registration RCL 

Suppose a registration RCL $g$ is fixed. In this subsection we
address the problem of finding a paging RCL $f$ to minimize
$C(f,g)$. For $i_0 \in S$ and $0 \le k \le k_{\max}$, let $w(i_0, k)$ denote
the conditional probability distribution of the state of the MS,
given that the most recent report occurred $k$ time units earlier
and the state at the time of the most recent report was $i_0$. Thus,
$w(i_0, 0) = \delta(i_0)$, and for larger $k$ the $w$'s can be computed
by the recursion:

$$w(i_0, k + 1) = \Phi(w(i_0, k), g(i_0, k)).$$

The paging order vector f{ia, fc) is simply the one to be used 
when the MS must be paged fc time units after the previous 
report. At such time the conditional distribution of the state of 
the MS given the observations of the base station is w{io,k — 
1)P. Thus, the probability the MS is located in cell c, just 
before the paging begins is given by 



$$p(c \mid i_0, k) = \sum_{j \in S} \sum_{l \in c} w_j(i_0, k-1)\, p_{jl}.$$

Finally, $f(i_0, k)$ is the paging order vector for ordering the
cells $c$ according to decreasing values of the probabilities
$p(c \mid i_0, k)$.
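As an illustration of the computation just described, the following Python sketch forms the cell probabilities $p(c \mid i_0, k)$ from $w(i_0, k-1)$ and $P$ and then sorts the cells by decreasing probability. The function names, the three-state transition matrix, and the two-cell partition are hypothetical and chosen only for the example; breaking ties by cell index is also an assumption.

```python
import numpy as np

def cell_probabilities(w_prev, P, cells):
    """p(c | i0, k) = sum_{j in S} sum_{l in c} w_j(i0, k-1) * p_{jl}."""
    state_dist = w_prev @ P  # distribution of the state just before paging
    return np.array([state_dist[list(c)].sum() for c in cells])

def paging_order(w_prev, P, cells):
    """Cell indices sorted by decreasing probability (ties: lower index first)."""
    p = cell_probabilities(w_prev, P, cells)
    return [int(c) for c in np.argsort(-p, kind="stable")]

# Hypothetical data: 3 states, cell 0 = {0}, cell 1 = {1, 2}.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
cells = [(0,), (1, 2)]
w_prev = np.array([0.0, 1.0, 0.0])  # plays the role of w(i0, k-1)

probs = cell_probabilities(w_prev, P, cells)
order = paging_order(w_prev, P, cells)
```

In this toy instance the second cell carries most of the probability mass, so it is paged first.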
Fig. 3. Rectangular grid motion model

D. Iterative optimization algorithm 

In the previous subsections we described how to find an
optimal $g$ for given $f$ and vice versa. This suggests an iterative
method for finding an individually optimal pair $(f, g)$. The
method works as follows. Fix an arbitrary registration RCL
$g^0$. Then execute the following steps.

• Find a paging RCL $f^0$ to minimize $C(f^0, g^0)$,

• Find a registration RCL $g^1$ to minimize $C(f^0, g^1)$,

• Find a paging RCL $f^1$ to minimize $C(f^1, g^1)$, and so on.

Then $C(f^0, g^0) \ge C(f^0, g^1) \ge C(f^1, g^1) \ge C(f^1, g^2) \ge \cdots$.
Since there are only finitely many RCLs, it must be that for
some integer $d$, $C(f^d, g^d) = C(f^d, g^{d+1})$. By construction,
the paging RCL $f^d$ is optimal given the registration RCL
$g^d$. Similarly, $g^{d+1}$ is optimal given $f^d$. However, since
$C(f^d, g^d) = C(f^d, g^{d+1})$, it follows that $g^d$ is also optimal
given the paging RCL $f^d$. Therefore, $(f^d, g^d)$ is an
individually optimal pair of RCLs.
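The alternation can be sketched in a few lines of Python. This is only a schematic version under strong simplifying assumptions: the finite sets of RCLs are given as explicit lists, each minimization is done by exhaustive search instead of the dynamic programs of Sections IV-B and IV-C, and the cost table is invented for illustration.

```python
def alternate(cost, pagings, registrations, g0):
    """Alternate exact minimizations until the registration step stops improving."""
    g = g0
    f = min(pagings, key=lambda x: cost(x, g))         # f optimal for g
    while True:
        g_next = min(registrations, key=lambda y: cost(f, y))
        if cost(f, g_next) >= cost(f, g):              # g was already optimal for f
            return f, g
        g = g_next
        f = min(pagings, key=lambda x: cost(x, g))     # re-optimize paging

# Hypothetical cost table with two individually optimal pairs.
TABLE = {("fA", "gA"): 2.0, ("fA", "gB"): 3.0,
         ("fB", "gA"): 3.0, ("fB", "gB"): 1.0}

def cost(f, g):
    return TABLE[(f, g)]

pair_from_gA = alternate(cost, ["fA", "fB"], ["gA", "gB"], "gA")
pair_from_gB = alternate(cost, ["fA", "fB"], ["gA", "gB"], "gB")
```

Started from `gA`, the sketch stops at a pair of cost 2 even though a pair of cost 1 exists, which mirrors the point that an individually optimal pair need not be jointly optimal.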

V. EXAMPLES 

Two examples are given in this section. Additional examples 
based on random walk models are in the next section. 

A. Rectangular grid example 

Consider a rectangular grid topology, such that each cell has
four neighbors. The diagram to the left in Figure 3 shows the
finite $i_{\max} \times j_{\max}$ rectangular grid topology. To provide the
full complement of four neighbors to cells on the edges of the
grid, the region is wrapped into a torus. The torus can serve to
approximate larger sets of cells. Also, by the symmetry of the
torus, the functions $f(i_0, k)$, $g(i_0, k)$ and distributions $w(i_0, k)$
need be computed for only one value of last reported cell $i_0$.
Each cell in Figure 3 is represented by the index pair $(i, j)$,
where $i = 0, 1, \ldots, i_{\max} - 1$ is the index for the horizontal
axis, and $j = 0, 1, \ldots, j_{\max} - 1$ is the index for the vertical
axis.

For simplicity, we assume that there is only one state per
cell, so we can take $C = S$. For a numerical example, consider
a $15 \times 15$ torus grid with motion parameters $p_{sty} = 0.4$,
$p_u = p_d = p_l = 0.1$, $p_r = 0.3$, $x_0 = (5, 5)$ and other parameters
$\lambda_p = 0.03$, $\mathcal{P} = 1$, $\mathcal{R} = 0.6$, $\beta = 0.9$, and $k_{\max} = 200$.
We numerically calculated an individually optimal pair $(f, g)$
of RCLs. A sample path of $X$ and $w$ generated using those
controls is indicated in Figure 4. The figure shows for selected
times $t$ the state $X(t)$, indicated by a small black square, and
the conditional state distribution $w(t)$, indicated as a moving
bubble. The distribution $w(t)$ collapses to a single unit mass
point at $t = 9$ due to a page and at $t = 27$ due to a registration.
Roughly speaking, the MS registers when it is not where the
network expects it to be, given the last report received by the
network. For instance, at time $t = 26$ the MS is located at
the tail edge of the bubble, so the network has low accuracy
in guessing the MS location. One time unit later, at $t = 27$, the
MS finds itself so far from where the network thinks it should
be that the MS registers.

Fig. 4. Evolution of the state $X(t)$ and the conditional distribution of the
state $w(t)$ for the rectangular grid example.

Fig. 5. Simple example
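For readers who wish to experiment with the torus motion model, a transition matrix can be assembled as in the following Python sketch. The flattening of cell $(i, j)$ to index $i \cdot j_{\max} + j$ and the mapping of "up", "down", "left" and "right" to index offsets are assumptions made here for illustration; only the parameter values are taken from the numerical example.

```python
import numpy as np

def torus_transition_matrix(n_i, n_j, p_sty, p_u, p_d, p_l, p_r):
    """Random walk on an n_i x n_j torus; cell (i, j) is flattened to i * n_j + j."""
    P = np.zeros((n_i * n_j, n_i * n_j))
    # Assumed convention: i is horizontal (right = +1), j is vertical (up = +1).
    moves = [((0, 0), p_sty), ((0, 1), p_u), ((0, -1), p_d),
             ((-1, 0), p_l), ((1, 0), p_r)]
    for i in range(n_i):
        for j in range(n_j):
            for (di, dj), p in moves:
                target = ((i + di) % n_i) * n_j + (j + dj) % n_j  # wrap around
                P[i * n_j + j, target] += p
    return P

P = torus_transition_matrix(15, 15, p_sty=0.4, p_u=0.1, p_d=0.1, p_l=0.1, p_r=0.3)
start = 5 * 15 + 5                 # initial cell x_0 = (5, 5)
w1 = np.eye(15 * 15)[start] @ P    # state distribution after one step
```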

B. Simple Example 

The following is an example of a small network for which
jointly optimal paging and registration policies can be computed.
The example also affords a pair of individually optimal
RCLs which are not jointly optimal. The space structure of
the example is shown in Figure 5. $S = \{0, 1, 2, 3, 4\}$ and
$C = \{c_0, c_1, c_2\}$ with $c_0 = \{0\}$, $c_1 = \{1, 2\}$, $c_2 = \{3, 4\}$.
From state 0, the MS transits to state 1 with probability 0.4 and
to state 3 with probability 0.6. The other possible transitions
shown in the figure have probability 1. The initial state is taken
to be 0.

Fig. 6. Evolution of $w(t)$ for the simple example.



TABLE I
Registration policies $g_A$, $g_B$, $g_C$, $g_D$ for the simple example.

Policy   $g(\delta(0))$   $P[R_1 \mid P_1^c]$   $P[N_2 = 2 \mid P_1^c \cap P_2]$
A        (0,0,0,0,0)      0                     0.4
B        (0,1,0,0,0)      0.4                   0
C        (0,0,0,1,0)      0.6                   0
D        (0,1,0,1,0)      1                     0






We first describe the jointly optimal pair of paging and
registration policies. We consider, without loss of optimality,
policies given by feedback control laws $(f, g)$ as described in
Section III. Thus we take $u(t) = f(w(t-1))$ and $v(t) = g(w(t-1))$.
Due to the special structure of this example, the
process $w(t)$ takes values in a set of at most seven states, and
the possible transitions are shown in Figure 6. The dynamic
programming problem for jointly optimal policies thus reduces
to a finite state problem. The optimal choice of the mapping $f$
is given by $f^*(w)$, which pages states in decreasing order of
$wP$. It remains to find the optimal registration policy mapping
$g$.

We claim that if $t \bmod 3 = 0$ or $t \bmod 3 = 2$, then it is
optimal to not register at time $t$. Indeed, if $t \bmod 3 = 0$ then
the network already knows the MS is in state 0, so registration
would cost $\mathcal{R}$ and provide no benefit. If $t \bmod 3 = 2$, then the
network knows that the MS will be in state 0 at time $t + 1$,
which is the next time of a potential page. Thus, again the
registration at time $t$ would cost $\mathcal{R}$ and provide no benefit.
This proves the claim.

Therefore, it remains to find the optimal registration vector
$v(t)$ to use when $t \bmod 3 = 1$. Such vector is deterministic,
given by $g(\delta(0))$. There are essentially only four possible
choices for $g(\delta(0))$, as indicated in Table I.

The cost for any pair $(f^*, g)$ is given by

$$C(f^*, g) = \frac{\mathcal{R}\beta(1-\lambda_p) P[R_1 \mid P_1^c] + \lambda_p \mathcal{P}\left(1.4\beta + \beta^2 + \beta^2 (1-\lambda_p) P[N_2 = 2 \mid P_1^c \cap P_2] + \beta^3\right)}{1 - \beta^3}.$$

Consulting Table I we thus find that $(f^*, g_A)$ is jointly optimal
if $\mathcal{R} \ge \lambda_p \mathcal{P} \beta$, and $(f^*, g_B)$ is jointly optimal if $\mathcal{R} \le \lambda_p \mathcal{P} \beta$.

For the remainder of this example we consider policies
given by RCLs. Under the assumption that $0 < \mathcal{R} < \lambda_p \mathcal{P} \beta$,
the pair of mappings $(f^*, g_B)$ is equivalent to a pair of RCLs,
which we denote by $(f_B, g_B)$. Under $g_B$, the MS registers only
after entering state 1 and not being paged. The pair $(f_B, g_B)$
is jointly optimal, and hence it is also individually optimal.
Similarly, let $(f_C, g_C)$ be the RCLs corresponding to the feedback
mappings $(f^*, g_C)$. In particular, an MS using registration RCL
$g_C$ registers only after entering state 3 and not being paged.

Proposition 5.1: The pair of RCLs $(f_C, g_C)$ is individually
optimal, but not jointly optimal.

Proof: The paging RCL $f_C$ is optimal for the registration
RCL $g_C$ because for $g_C$ fixed, it is equivalent to the optimal
feedback mapping $f^*$. Suppose then that the paging RCL $f_C$
is used. Note that if the MS does not report at time
$t = 1$, and if it is paged at time $t = 2$, the network will
page cell $c_1$ first. Hence, if the MS enters state 3 at time
$t = 1$ and if it is not paged at $t = 1$, then by registering for
cost $\mathcal{R}$ it can avoid the two or more pages required at time
$t = 2$ in case of a page at $t = 2$. Since $\mathcal{R} < \lambda_p \mathcal{P} \beta$, it is
optimal to have the MS register at $t = 1$ in this situation.
Thus $g_C$ is optimal for $f_C$, so the pair is individually optimal.
However, $C(f_C, g_C) > C(f_B, g_B)$, so that $(f_C, g_C)$ is not
jointly optimal. ■
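The comparison of the four registration choices can be checked numerically. The Python sketch below assumes the cost expression reads $C(f^*, g) = [\mathcal{R}\beta(1-\lambda_p)P[R_1 \mid P_1^c] + \lambda_p\mathcal{P}(1.4\beta + \beta^2 + \beta^2(1-\lambda_p)P[N_2 = 2 \mid P_1^c \cap P_2] + \beta^3)]/(1-\beta^3)$ and that the blank entries of Table I are zeros. The parameter values are hypothetical and were picked so that one case has $\mathcal{R} < \lambda_p\mathcal{P}\beta$ and the other $\mathcal{R} > \lambda_p\mathcal{P}\beta$.

```python
def cycle_cost(p_reg, p_two_pages, lam, P, R, beta):
    """Discounted cost under optimal paging, for the assumed form of the formula."""
    reg = R * beta * (1 - lam) * p_reg
    page = lam * P * (1.4 * beta + beta**2
                      + beta**2 * (1 - lam) * p_two_pages + beta**3)
    return (reg + page) / (1 - beta**3)

# (P[R_1 | P_1^c], P[N_2 = 2 | P_1^c and P_2]) for policies A-D, as read from Table I.
TABLE_I = {"A": (0.0, 0.4), "B": (0.4, 0.0), "C": (0.6, 0.0), "D": (1.0, 0.0)}

def best_policy(lam, P, R, beta):
    costs = {name: cycle_cost(pr, pn, lam, P, R, beta)
             for name, (pr, pn) in TABLE_I.items()}
    return min(costs, key=costs.get), costs

# Hypothetical parameters: lam * P * beta = 0.45.
cheap_reg, costs_cheap = best_policy(lam=0.5, P=1.0, R=0.2, beta=0.9)
costly_reg, costs_costly = best_policy(lam=0.5, P=1.0, R=0.6, beta=0.9)
```

With the cheaper registration cost, policy B comes out best and policy C is strictly worse than B, in line with Proposition 5.1.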



VI. Jointly Optimal Policies for Some Random Walk Models

The structure of jointly optimal paging and registration
policies is identified in this section for three random walk
models of motion. The first is a discrete state one-dimensional
random walk, the second is a symmetric random walk in
$\mathbb{R}^d$ for any $d \ge 1$, and the third is a Gaussian random walk
in $\mathbb{R}^d$ for any $d \ge 1$.

A. Symmetric random walk in $\mathbb{Z}$

Suppose the motion of the MS is modeled by a discrete-time
random walk on an infinite linear array of cells, such that the
displacement of the walk at each step has some probability
distribution $b$. Equivalently, $(X(t) : t \ge 0)$ is a discrete
time Markov process on $\mathbb{Z}$ with one-step transition probability
matrix $P$ given by $p_{ij} = b_{j-i}$. For any probability distribution
$w$, $wP = w * b$. It is assumed that $b_i$ is a nonincreasing
function of $|i|$, or in other words, $b$ is symmetric about zero and
unimodal. In the general form of our model, multiple states
can correspond to the same cell, but for this example, each
integer state $i$ corresponds to a distinct cell in which the MS
can be paged. So $C = S = \mathbb{Z}$. It is assumed that the network
knows the initial state $x_0$.

Due to the translation invariance of $P$ for this example,
the update equations of the dynamic program are translation
invariant, and therefore the paging and registration RCLs can
also be taken to be translation invariant. Thus, we write the
RCLs as $f = (f(k) : k \ge 1)$ and $g = (g(k) : k \ge 1)$. These
RCLs give the control decisions if the last reported state is
$i_0 = 0$, and hence for other values of $i_0$ by translation in
space.

It turns out that for this example, the optimal paging policy
is of ping-pong type: cells are searched in an order of increasing
distance from the cell in which the previous report occurred.
The optimal registration policy is of distance threshold type:
the mobile station registers whenever its distance from the
previous reporting point exceeds a threshold. Specifically, only
RCLs of the following form need to be considered. The actions
of the policies do not depend on the time $k$ elapsed since
last report, so the argument $k$ is suppressed. For the paging
policy we take the ping-pong policy, given by the RCL
$f^* = (0, 1, -1, 2, -2, 3, -3, \ldots)$. Thus, if the MS is to be paged
and if it was last reported to be at state $i_0$, then the states are
searched in the order $i_0, i_0 + 1, i_0 - 1, i_0 + 2, i_0 - 2, \ldots$. The
registration policy is given by the RCL $g^* = I_{\{i \ge d_r \text{ or } i \le -d_l\}}$,
where the two distance thresholds $d_l, d_r \ge 1$ are such that
either $d_l = d_r$ or $d_l = d_r - 1$.
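The two policy forms are simple enough to state in code. The following Python sketch generates the ping-pong search order and evaluates a distance-threshold registration rule. The function names are hypothetical, and the rule is written with non-strict inequalities (register when the offset $i$ satisfies $i \ge d_r$ or $i \le -d_l$), which is our reading of the indicator above.

```python
from itertools import count, islice

def ping_pong(i0=0):
    """Yield states in the order i0, i0+1, i0-1, i0+2, i0-2, ..."""
    yield i0
    for d in count(1):
        yield i0 + d
        yield i0 - d

def registers(i, d_l, d_r):
    """Threshold rule for an offset i measured from the last reported state."""
    return i >= d_r or i <= -d_l

first_seven = list(islice(ping_pong(), 7))
shifted = list(islice(ping_pong(5), 5))
```

Because both rules depend only on the offset from the last reported state, translating `i0` simply shifts the search order.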

Proposition 6.1: There is a choice of the distance thresholds
$d_l$ and $d_r$ such that the ping-pong paging policy given by
$f^*$ and the distance-threshold registration policy given by $g^*$
are jointly optimal.

The related work of Madhow, Honig, and Steiglitz [15]
finds the optimal registration policy assuming that the paging
policy is fixed to be the ping-pong policy. Also, it is not
difficult to show that for the distance threshold registration
policy specified by $g^*$, the optimal paging policy is the
ping-pong paging policy. However, a pair of individually optimal
RCLs may not be jointly optimal, as shown in the example of
Section V-B.

The remainder of this section is devoted to the proof of
Proposition 6.1. The following notation is standard in the
theory of majorization [16]. Given $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$,
let $x_\downarrow$ denote the nonincreasing rearrangement of $x$.
That is, $x_\downarrow = (x_{[1]}, x_{[2]}, \ldots, x_{[n]})$, where the coordinates
$x_{[1]}, x_{[2]}, \ldots, x_{[n]}$ are equal to a rearrangement of
$x_1, x_2, \ldots, x_n$ such that $x_{[1]} \ge x_{[2]} \ge \cdots \ge x_{[n]}$. Given
two vectors $x$ and $y$, we say that $y$ majorizes $x$, denoted by
$x \prec y$, if the following conditions hold:

$$\sum_{i=1}^{r} x_{[i]} \le \sum_{i=1}^{r} y_{[i]} \quad \text{for } 1 \le r \le n - 1, \qquad \sum_{i=1}^{n} x_{[i]} = \sum_{i=1}^{n} y_{[i]}.$$

Write $x \equiv y$ to denote that both $x \prec y$ and $y \prec x$, meaning
that $y$ is a rearrangement of $x$. The relation $x \prec y$ can be
defined in a similar fashion, in case $x$ and $y$ are nonnegative,
summable functions defined on some countably infinite discrete
set. In such case, $x_{[i]}$ denotes the $i$th coordinate, when
the coordinates of $x$ are listed in a nonincreasing order.
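For finite vectors the definition can be checked directly by comparing partial sums of the sorted coordinates. The Python sketch below does this; the function name `majorizes` and the numerical tolerance are assumptions introduced for the illustration.

```python
import numpy as np

def majorizes(y, x, tol=1e-12):
    """Return True if x ≺ y for two finite vectors of the same length."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]  # nonincreasing rearrangement
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    if xs.shape != ys.shape:
        raise ValueError("vectors must have the same length")
    cx, cy = np.cumsum(xs), np.cumsum(ys)
    partial_ok = np.all(cx[:-1] <= cy[:-1] + tol)   # sums over r = 1..n-1
    total_ok = abs(cx[-1] - cy[-1]) <= tol          # equal total sums
    return bool(partial_ok and total_ok)

point_mass = [1.0, 0.0, 0.0]
spread = [0.5, 0.3, 0.2]
```

A point mass majorizes every probability vector of the same length, while a rearrangement of a vector both majorizes it and is majorized by it.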

Given a probability distribution $\mu$ on $\mathbb{Z}$, let $s(\mu)$ denote the
mean number of states that must be searched to find the MS,
given that the MS has distribution $\mu$ and the optimal search
order for $\mu$ is used. The optimal search order is maximum
likelihood search [19], under which states are searched in order
of decreasing probability. Summation by parts yields

$$s(\mu) = \sum_{i=1}^{\infty} i\, \mu_{[i]} = 1 + \sum_{r=1}^{\infty} \Big( 1 - \sum_{i=1}^{r} \mu_{[i]} \Big),$$

which immediately implies the following lemma.

Lemma 6.1: If $\mu$ and $\nu$ are probability distributions such
that $\mu \prec \nu$, then $s(\mu) \ge s(\nu)$.
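The two expressions for $s(\mu)$ can be compared numerically for a finitely supported distribution, as in the following Python sketch. The function names are hypothetical, and the infinite sums reduce to finite ones because the tail terms vanish once all the mass has been accumulated.

```python
import numpy as np

def mean_search_length(mu):
    """s(mu) = sum_i i * mu_[i], searching states in order of decreasing probability."""
    sorted_mu = np.sort(np.asarray(mu, dtype=float))[::-1]
    return float(np.dot(np.arange(1, sorted_mu.size + 1), sorted_mu))

def mean_search_length_tail(mu):
    """The same quantity via 1 + sum_r (1 - sum_{i <= r} mu_[i])."""
    sorted_mu = np.sort(np.asarray(mu, dtype=float))[::-1]
    return float(1.0 + np.sum(1.0 - np.cumsum(sorted_mu)))

spread = [0.2, 0.5, 0.3]
point_mass = [0.0, 1.0, 0.0]
```

Here `spread` is majorized by `point_mass`, and its mean search length is larger, as Lemma 6.1 requires.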

A function or probability distribution $\mu$ on $\mathbb{Z}$ is said to be
neat if $\mu_0 \ge \mu_1 \ge \mu_{-1} \ge \mu_2 \ge \mu_{-2} \ge \cdots$.

Lemma 6.2: If $\mu$ is a neat probability distribution, then the
convolution $\mu * b$ is neat.

Proof: For $i \ge 0$, let $b^{(i)}$ denote the uniform probability
distribution over the interval of integers $[-i, i]$. The conclusion
is easy to verify in case $b$ has the form $b^{(i)}$ for some $i$. In
general, $b$ is a convex combination of such $b^{(i)}$'s, and then
$\mu * b$ is a convex combination of the functions $\mu * b^{(i)}$, using the
same coefficients. Convex combinations of neat distributions
are neat, so $\mu * b$ is indeed neat. ■
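A quick numerical check of Lemma 6.2 on one instance is sketched below in Python. Distributions are stored as odd-length arrays indexed by $-m, \ldots, m$ with the origin in the middle, which is a representation chosen here for convenience. The particular $\mu$ and $b$ are hypothetical examples.

```python
import numpy as np

def is_neat(mu, tol=1e-12):
    """Check mu_0 >= mu_1 >= mu_-1 >= mu_2 >= mu_-2 >= ... for an array on -m..m."""
    m = len(mu) // 2
    order = [0] + [sign * d for d in range(1, m + 1) for sign in (1, -1)]
    vals = [mu[m + i] for i in order]
    return all(a >= b - tol for a, b in zip(vals, vals[1:]))

mu = np.array([0.0, 0.2, 0.5, 0.3, 0.0])   # support -2..2, neat
b = np.array([0.25, 0.5, 0.25])            # symmetric and unimodal on -1..1
conv = np.convolve(mu, b)                  # distribution of the sum, on -3..3
```

Since both arrays are centered, `np.convolve` returns an array that is again centered at the origin, so `is_neat` can be applied to it unchanged.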

Lemma 6.3: If $\mu$ and $\nu$ are probability distributions such
that $\mu \prec \nu$ and $\nu$ is neat, then $\mu * b \prec \nu * b$.

The proof of Lemma 6.3 is placed in the appendix because
the proof is specific to the discrete state setting. Lemma 6.7
in the next subsection is similar, and its proof shows the
connection to Riesz's rearrangement inequality.

Let $\mu$ be a probability distribution on $\mathbb{Z}$ and let $0 \le \lambda < 1$.
Let $T(\mu, \lambda)$ be the set of probability distributions $\nu'$ on $\mathbb{Z}$ such
that $(1 - \lambda)\nu' \le \mu$, pointwise. Intuitively, such a $\nu'$ is obtained
from $\mu$ by trimming away from $\mu$ probability mass $\lambda$ and
renormalizing the remaining mass. The following lemma has
an easy proof which is left to the reader. Roughly speaking,
the lemma means that given $\mu$ and $\lambda$, the maximal
distribution in $T(\mu, \lambda)$, in the majorization order, is obtained
by trimming mass from the smallest $\mu_i$'s.

Lemma 6.4: (Optimality of minimum likelihood trimming)
There exists $\nu \in T(\mu, \lambda)$ such that for some $k \ge 1$,



if j > k. 

Furthermore, for any other $\nu' \in T(\mu, \lambda)$, $\nu' \prec \nu$.

Let $f$ and $g$ be RCLs (possibly dependent on the elapsed
time $k$ since last report). The cost $C(f, g)$ can be computed
by considering the process only up until the first time $\tau$ that
a report occurs (i.e. one reporting cycle). Let
$a(k) = P[\tau = k] = a_p(k) + a_r(k)$, where $a_p(k)$ is the probability that $\tau = k$ and
the first report is a page, and $a_r(k)$ is the probability that $\tau = k$
and the first report is a registration. Also let $w(k)$ denote the
conditional distribution of the MS given that no report occurs
up to time $k$ for the pair of RCLs $(f, g)$. Then

$$C(f, g) = \frac{\sum_{k=1}^{\infty} \beta^k \{ \mathcal{P} a_p(k)\, s(w(k-1) * b) + \mathcal{R} a_r(k) \}}{1 - \sum_{k=1}^{\infty} \beta^k a(k)}.$$

Note that the cost depends entirely on the $a$'s and on the mean
numbers of pages required, given by the terms $s(w(k-1) * b)$.

Lemma 6.5: (Optimality of ping-pong paging $f^*$) There
exists a registration RCL $g^\circ$ so that $C(f^*, g^\circ) \le C(f, g)$.

Proof: Take the registration RCL $g^\circ$ to be of distance
threshold type with time varying thresholds and possibly with
randomization at the left threshold if the thresholds are equal,
or at the right threshold if the right threshold is one larger
than the left threshold. More precisely, for fixed $k$: all the
values $g^\circ_l(k)$ are binary except for possibly one value of $l$,
and $1 - g^\circ(k)$ is neat. Select the thresholds and randomization
parameter so that the $a$'s, $a_p$'s, and $a_r$'s are the same for the
pair $(f^*, g^\circ)$ as for the originally given pair $(f, g)$.

Let $(w^\circ(k) : k \ge 0)$ and $\tau^\circ$ be defined for $(f^*, g^\circ)$ just as
$(w(k) : k \ge 0)$ and $\tau$ are defined for $(f, g)$. To complete the
proof of the lemma it remains to show that
$s(w^\circ(k-1) * b) \le s(w(k-1) * b)$ for $k \ge 1$. The sequences $w$ and $w^\circ$ are updated
in similar ways, by Lemma 3.1:

$$w(k) = \Phi(w(k-1), g(k)), \qquad 1 \le k \le \tau - 1,$$
$$w^\circ(k) = \Phi(w^\circ(k-1), g^\circ(k)), \qquad 1 \le k \le \tau - 1.$$

By the definition of $\Phi$, this means the distribution $w(k)$
is obtained by first forming the convolution $w(k-1) * b$,
removing a fraction $g_l(k)$ of the mass at each location $l$, and
renormalizing to obtain a probability distribution. The RCL $g^\circ$
trims mass in a minimum likelihood fashion. Thus, Lemmas
6.2, 6.3, and 6.4 can be used to show by induction that for all $k \ge 1$: $w^\circ(k)$
is neat, $w(k) \prec w^\circ(k)$, and $w(k-1) * b \prec w^\circ(k-1) * b$. Thus
by Lemma 6.1, $s(w^\circ(k-1) * b) \le s(w(k-1) * b)$, completing
the proof of Lemma 6.5. ■
Proof of Proposition 6.1: In view of Lemma 6.5 it remains
to show that if the ping-pong paging policy specified by $f^*$
is used, then for some choice of fixed distance thresholds $d_l$
and $d_r$, the registration policy specified by $g^*$ is optimal. This
can be done by examining a dynamic program for the optimal
registration policy, under the assumption that the RCL $f^*$ is
used. Let $V_n(j)$ denote the mean discounted cost for $n$ time
steps to go, given that the mobile is located directed distance
$j$ from its last reported state. Then

$$V_{n+1}(j) = \beta \sum_{l} b_{l-j} \Big[ \lambda_p \big( \mathcal{P} f^*_l + V_n(0) \big) + (1 - \lambda_p) \min\{ V_n(l), \mathcal{R} + V_n(0) \} \Big].$$

By a contraction property of these dynamic programming
equations, the limit $V_* = \lim_{n \to \infty} V_n$ exists. Argument by
induction yields that the functions $-V_n$ are neat, and hence
that $-V_*$ is neat. By the dynamic programming principle, an
optimal registration policy is given by the RCL $g^*$ specified
by:

$$g^*_l = \begin{cases} 1 & \text{if } V_*(l) > \mathcal{R} + V_*(0) \\ 0 & \text{else.} \end{cases}$$

Since $-V_*$ is neat, the optimal registration RCL $g^*$ has the
required threshold type. Proposition 6.1 is proved. ■
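The dynamic program in the proof can be explored numerically with value iteration. The Python sketch below is not the computation used in the paper. It truncates the state space to offsets $-M, \ldots, M$ and clips steps that would leave this window, and it interprets $f^*_l$ as the number of states searched to reach offset $l$ in the ping-pong order (1 for $l = 0$, $2l$ for $l > 0$, $2|l| + 1$ for $l < 0$). The step distribution and all parameter values are hypothetical.

```python
import numpy as np

def value_iteration(b, lam, P, R, beta, M=30, iters=500):
    """Iterate V_{n+1}(j) = beta * sum_l b_{l-j} [lam (P f*_l + V_n(0))
    + (1 - lam) min{V_n(l), R + V_n(0)}] on offsets -M..M (steps clipped)."""
    states = np.arange(-M, M + 1)
    # Assumed reading of f*_l: position of offset l in the ping-pong order.
    pages = np.where(states > 0, 2 * states, 2 * np.abs(states) + 1)
    V = np.zeros(states.size)
    for _ in range(iters):
        v0 = V[M]                                   # value at offset 0
        per_landing = (lam * (P * pages + v0)
                       + (1 - lam) * np.minimum(V, R + v0))
        V_new = np.zeros_like(V)
        for step, prob in b.items():
            landing = np.clip(states + step, -M, M) + M   # array index of l
            V_new += prob * per_landing[landing]
        V = beta * V_new
    g = (V > R + V[M]).astype(int)                  # register where V(l) > R + V(0)
    return states, V, g

b = {-1: 0.25, 0: 0.5, 1: 0.25}                     # symmetric, unimodal steps
states, V, g = value_iteration(b, lam=0.1, P=1.0, R=1.0, beta=0.9)
no_reg = states[g == 0]                             # offsets where the MS stays silent
d_r, d_l = int(no_reg.max()) + 1, int(-no_reg.min()) + 1
```

In this instance the offsets without registration form an interval around 0, so the computed rule has the distance-threshold form of Proposition 6.1.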



B. Symmetric random walk in $\mathbb{R}^d$

To extend Proposition 6.1 to more than one dimension, we
consider a continuous state mobility model, with $S = C = \mathbb{R}^d$,
for an integer $d \ge 1$. Of course in practice we expect $d \le 3$.
A function on $\mathbb{R}^d$ is said to be symmetric nonincreasing if it
can be expressed as $\phi(|x|)$, for some nonincreasing function $\phi$
on $\mathbb{R}_+$, where $|x|$ denotes the usual Euclidean norm of $x$. Let
$X_0 \in \mathbb{R}^d$, and let $b$ be a symmetric nonincreasing probability
density function (pdf) on $\mathbb{R}^d$. The location of the MS at time
$t$ is assumed to be given by $X(t) = X_0 + \sum_{s=1}^{t} B_s$, where
the initial state $X_0$ is known to the network, and the random
variables $B_1, B_2, \ldots$ are independent, with each having pdf $b$.

Let $\mathcal{L}^d(A)$ denote the volume (i.e. Lebesgue measure) of
a Borel set $A \subset \mathbb{R}^d$. A paging order function $r = (r_x : x \in \mathbb{R}^d)$
is a nonnegative function on $\mathbb{R}^d$ such that
$\mathcal{L}^d(\{x : r_x \le \gamma\}) = \gamma$ for $\gamma \ge 0$. Thus, as $\gamma$ increases, the volume of
the set $\{x : r_x \le \gamma\}$ increases at unit rate. Imagine the set
$\{x : r_x \le \gamma\}$ increasing as $\gamma$ increases, until the MS is in the
set. If the MS is located at $x$ and is paged according to the
paging order function $r$, then $r_x$ denotes the volume of the
set searched to find $x$. So the paging cost is $\mathcal{P} r_x$, where $\mathcal{P}$ is
the cost of paging per unit volume searched. An example of
a paging order is increasing distance search, starting at $x_0$,
which corresponds to letting $r_x$ be the volume of a ball of
radius $|x - x_0|$ in $\mathbb{R}^d$. As in the finite state model, assume the
cost of a registration is $\mathcal{R}$.

Paging and registration policies $u$ and $v$ can be defined for
this model just as they were for the finite state model, with
paging order functions playing the role of paging order vectors.
Thus, for each $t \ge 1$, $u(t) = (u_x(t) : x \in \mathbb{R}^d)$ is a paging
order function, and $v(t) = (v_x(t) : x \in \mathbb{R}^d)$ is a $[0, 1]$-valued
function. In addition, translation invariant RCLs $f$ and $g$ can be
defined as they were for the one-dimensional network model,
and they determine policies $u$ and $v$ as follows. If the location
of the most recent report was $x_0$, then $u_x(t) = f_{x - x_0}$ and
$v_x(t) = g_{x - x_0}$. Let $f^*$ be the RCL for increasing distance
search paging: $f^*_x$ is the volume of the radius $|x|$ ball in $\mathbb{R}^d$.
Let $g^*$ be the RCL for the distance threshold registration policy
with some threshold $\eta$: $g^*_x = I_{\{|x| \ge \eta\}}$.
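The increasing distance paging order and the threshold registration rule can be evaluated with the standard formula for the volume of a Euclidean ball, $\pi^{d/2} \rho^d / \Gamma(d/2 + 1)$. The Python sketch below is only an illustration, and the function names are hypothetical.

```python
import math

def ball_volume(radius, d):
    """Lebesgue measure of a Euclidean ball of the given radius in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d

def increasing_distance_order(x, x0):
    """Paging order value r_x: volume searched before location x is reached."""
    return ball_volume(math.dist(x, x0), len(x))

def registers(x, x0, eta):
    """Distance threshold rule: register when |x - x0| >= eta."""
    return math.dist(x, x0) >= eta

searched = increasing_distance_order((3.0, 4.0), (0.0, 0.0))
```

The paging cost for an MS found at `x` would then be the per-unit-volume paging cost times `searched`.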

Proposition 6.2: There is a choice of the distance threshold
$\eta$ such that $f^*$ and $g^*$ are jointly optimal.

The proof of Proposition 6.1 can be used for the proof of
Proposition 6.2, with symmetric nonincreasing functions on
$\mathbb{R}^d$ replacing neat probability distributions on $\mathbb{Z}$. A suitable
variation of Lemma 6.3 must be established, and we will
show that this can be done by applying Riesz's rearrangement
inequality. To get started, we introduce some notation from the
theory of rearrangements of functions (similar to the notation
in [13]). If $A$ is a Borel subset of $\mathbb{R}^d$ with $\mathcal{L}^d(A) < \infty$, then
the symmetric rearrangement of $A$, denoted by $A^\sigma$, is the open
ball in $\mathbb{R}^d$ centered at 0 such that $\mathcal{L}^d(A) = \mathcal{L}^d(A^\sigma)$. Given
an integrable, nonnegative function $h$ on $\mathbb{R}^d$, its symmetric
nonincreasing rearrangement, $h^\sigma$, is defined by

$$h^\sigma(x) = \int_0^\infty I_{\{h > t\}^\sigma}(x)\, dt.$$

Let $h_1 * h_2$ denote the convolution of functions $h_1$ and $h_2$,
and let $(h_1, h_2) = \int_{\mathbb{R}^d} h_1 h_2\, dx$. A proof of the following
celebrated inequality is given in [13].

Lemma 6.6: (F. Riesz's rearrangement inequality [18]) If $h_1$,
$h_2$, and $h_3$ are nonnegative functions on $\mathbb{R}^d$, then
$(h_1, h_2 * h_3) \le (h_1^\sigma, h_2^\sigma * h_3^\sigma)$.

Given two probability densities on $\mathbb{R}^d$, $\nu$ majorizes $\mu$,
written $\mu \prec \nu$, if

$$\int_{|x| \le \rho} \mu^\sigma\, dx \le \int_{|x| \le \rho} \nu^\sigma\, dx \quad \text{for all } \rho \ge 0.$$

Equivalently, $\mu \prec \nu$ if, for any Borel set $F \subset \mathbb{R}^d$, there is
another Borel set $F' \subset \mathbb{R}^d$ with $\mathcal{L}^d(F) = \mathcal{L}^d(F')$, such that

$$\int_F \mu\, dx \le \int_{F'} \nu\, dx.$$

If $\mu \prec \nu$ then $(\mu^\sigma, h) \le (\nu^\sigma, h)$, for any symmetric
nonincreasing function $h$. (To see this, use the fact that such
an $h$ is a convex combination of indicator functions of balls
centered at zero.)

Lemma 6.7: If $\mu$ and $\nu$ are probability densities such that
$\mu \prec \nu$, and if $\nu$ is symmetric nonincreasing, then $\mu * b \prec \nu * b$.

Proof: Let $F$ be an arbitrary Borel subset of $\mathbb{R}^d$. Let
$h_1 = \mu$, $h_2 = I_F$, and $h_3 = b$. Then $h_1^\sigma = \mu^\sigma$, $h_2^\sigma = I_{F^\sigma}$,
$h_3^\sigma = b$, and Riesz's rearrangement inequality yields
$(\mu, I_F * b) \le (\mu^\sigma, I_{F^\sigma} * b)$. Since $\mu \prec \nu = \nu^\sigma$ and $I_{F^\sigma} * b$ is symmetric
nonincreasing, $(\mu^\sigma, I_{F^\sigma} * b) \le (\nu, I_{F^\sigma} * b)$. Combining yields
$(\mu, I_F * b) \le (\nu, I_{F^\sigma} * b)$, or, equivalently by the symmetry of
$b$, $(\mu * b, I_F) \le (\nu * b, I_{F^\sigma})$. That is,

$$\int_F \mu * b\, dx \le \int_{F^\sigma} \nu * b\, dx.$$

Since $F$ was an arbitrary Borel subset of $\mathbb{R}^d$ and $\mathcal{L}^d(F) = \mathcal{L}^d(F^\sigma)$,
$\mu * b \prec \nu * b$. ■

Proof of Proposition 6.2: Proposition 6.2 follows from
Lemma 6.7 and the same arguments used to prove Proposition
6.1. The details are left to the reader. ■



C. Gaussian random walk in $\mathbb{R}^d$

Consider the following variation of the model of Section
VI-B. Let $X(t) = X_0 + \sum_{s=1}^{t} B_s$, where the random variables
$B_s$ are independent with a $d$-dimensional Gaussian density
with mean vector $m$ and covariance matrix $\Sigma$. Given a vector
$y$ let $|y|_\Sigma = (y^T \Sigma^{-1} y)^{1/2}$. Proposition 6.2 can be applied to
the process with initial state $\Sigma^{-1/2} X_0$ and increments
$\tilde{B}_i = \Sigma^{-1/2}(B_i - m)$. Suppose the time of the last report was $t_0$
and the location at that time was $X_0$, and suppose the MS just
jumped to a new state at time $t$. Let $\bar{x}(t) = X_0 + (t - t_0) m$.
If the MS must be paged at time $t$, the optimal paging policy
is to page according to expanding ellipses of the form
$\{x : |x - \bar{x}(t)|_\Sigma \le \rho\}$. If the MS is not paged at time $t$, the optimal
registration policy is for the MS to register if $|X(t) - \bar{x}(t)|_\Sigma \ge \eta$,
for a suitable threshold $\eta$.

A continuous time version of this result can also be established,
for which the motion of the MS is modeled as
a $d$-dimensional Brownian motion with drift vector $m$ and
infinitesimal covariance matrix $\Sigma$.
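The registration test for the Gaussian model amounts to comparing a Mahalanobis-type distance with the threshold. The Python sketch below illustrates this; the function names, the covariance matrix, the drift, and the threshold values are hypothetical.

```python
import numpy as np

def sigma_norm(y, Sigma):
    """|y|_Sigma = (y^T Sigma^{-1} y)^{1/2}, computed without forming the inverse."""
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(y @ np.linalg.solve(Sigma, y)))

def should_register(x, x_last, m, Sigma, elapsed, eta):
    """Register if the location is at Sigma-distance >= eta from the drifted center."""
    center = np.asarray(x_last, dtype=float) + elapsed * np.asarray(m, dtype=float)
    return sigma_norm(np.asarray(x, dtype=float) - center, Sigma) >= eta

Sigma = np.diag([4.0, 1.0])        # hypothetical covariance matrix
m = np.array([1.0, 0.0])           # hypothetical drift per time step
dist = sigma_norm([2.0, 0.0], Sigma)
```

Paging by expanding ellipses corresponds to searching locations in increasing order of the same distance from the drifted center.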

VII. Conclusions 

There are many avenues for future research in the area of 
paging and registration. This paper shows how the joint paging 
and registration optimization problem can be formulated as 
a dynamic programming problem with partially observed 
states. In addition, an iterative method is proposed, involving 
dynamic programming with a finite state space, in order to 
find individually optimal pairs of RCLs. While an example 
shows that, in principle, the individually optimal pairs need 
not be jointly optimal, no bounds are given on how far from 
optimal the individually optimal pairs can be. Furthermore, 
even the problem of finding individually optimal RCLs may 
be computationally prohibitive, so it may be fruitful to apply 
approximation methods such as neurodynamic programming 
[7]. This becomes especially true if the model is extended to
handle additional features of real world paging and registration
models, such as the use of parallel paging, overlapping
registration regions, congestion and queueing of paging requests
for different MSs, positive probabilities of missed pages, more
complex motion models, estimation of motion models, and so
on.

This paper shows that jointly optimal paging and registration 
policies for symmetric or Gaussian random walk models are 
given by nearest-location-first paging policies and distance 
threshold registration policies. It remains to be seen whether 
these policies are good ones, even if no longer optimal, when 
the assumptions of the model are violated. It also remains to 
be seen if jointly optimal policies can be identified for other 
subclasses of motion models. 

We found that majorization theory, and, in particular, Riesz's
rearrangement inequality, are tools well suited for the study of
certain search algorithms with feedback. These tools may be
more widely useful for addressing search or distributed sensing
problems.

Appendix 

Appendix A: On $\sigma$-algebra notation

Some basic definitions involving $\sigma$-algebras are collected in
this appendix. In this paper the network only observes random
variables with finite numbers of possible outcomes, so that
emphasis is given to conditioning with respect to finite
$\sigma$-algebras.

The collections of random variables considered in this
paper are defined on some underlying probability space. A
probability space is a triple $(\Omega, \mathcal{F}, P)$, such that $\Omega$ is the set
of all possible outcomes, $\mathcal{F}$ is a $\sigma$-algebra of subsets of $\Omega$
(so $\emptyset \in \mathcal{F}$ and $\mathcal{F}$ is closed under complements and countable
intersections) and $P$ is a probability measure, mapping each
element of $\mathcal{F}$ to the interval $[0, 1]$. The sets in $\mathcal{F}$ are called
events. A random variable $X$ is a function on $\Omega$ which is
$\mathcal{F}$ measurable, meaning that $\mathcal{F}$ contains all sets of the form
$\{\omega : X(\omega) \le c\}$. In the remainder of this section, $\mathcal{N}$ denotes
a $\sigma$-algebra that is a subset of $\mathcal{F}$. Intuitively, $\mathcal{N}$ models the
information available from some measurement: one can think
of $\mathcal{N}$ as the set of events that can be determined to be true or
false by the measurement. A random variable $Y$ is said to be $\mathcal{N}$
measurable if $\mathcal{N}$ contains all sets of the form $\{\omega : Y(\omega) \le c\}$.
Intuitively, $Y$ is $\mathcal{N}$ measurable if the information represented
by $\mathcal{N}$ determines $Y$.

An atom $B$ of $\mathcal{N}$ is a set $B \in \mathcal{N}$ such that if $A \subset B$ and
$A \in \mathcal{N}$ then either $A = \emptyset$ or $A = B$. Note that if $C \in \mathcal{N}$ and
$B$ is an atom of $\mathcal{N}$, then either $B \subset C$ or $B \subset C^c$. If $\mathcal{N}$ is
finite (has finite cardinality) then there is a finite set of atoms
$B_1, \ldots, B_m$ in $\mathcal{N}$ such that each element of $\mathcal{N}$ is either $\emptyset$ or
the union of one or more of the atoms.

Given a random variable $X$ with finite mean, one can define
$E[X \mid \mathcal{N}]$ in a natural way. It is an $\mathcal{N}$ measurable random
variable such that $E[XZ] = E[E[X \mid \mathcal{N}] Z]$ for any bounded,
$\mathcal{N}$ measurable random variable $Z$. In particular, if $A$ is an atom
in $\mathcal{N}$, then $E[X \mid \mathcal{N}]$ is equal to $E[X I_A]/P[A]$ on the set $A$.
(Any two versions of $E[X \mid \mathcal{N}]$ are equal with probability one.)

Given a random variable $Y$, we write $\sigma(Y)$ as the smallest
$\sigma$-algebra containing all sets of the form $\{\omega \in \Omega : Y(\omega) \le c\}$.
The notation $E[X \mid Y]$ is equivalent to $E[X \mid \sigma(Y)]$. In case $Y$
is a random variable with a finite number of possible outcomes
$\{y_1, \ldots, y_m\}$, the $\sigma$-algebra $\sigma(Y)$ is finite with atoms
$B_i = \{\omega : Y(\omega) = y_i\}$, $1 \le i \le m$. Furthermore, given a random
variable $X$ with finite mean, $E[X \mid Y]$ is the function on $\Omega$
which is equal to $E[X I_{B_i}]/P[B_i]$ on $B_i$ for $1 \le i \le m$.
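On a finite probability space the atom formula gives a direct way to compute $E[X \mid Y]$. The Python sketch below applies it to a fair die, a toy example that is not taken from the paper; exact rational arithmetic is used so that the results can be compared without rounding.

```python
from collections import defaultdict
from fractions import Fraction

def conditional_expectation(outcomes, prob, X, Y):
    """Return {y: E[X | Y = y]}, computed as E[X I_B] / P[B] on each atom B = {Y = y}."""
    numerator = defaultdict(Fraction)    # accumulates E[X I_B]
    mass = defaultdict(Fraction)         # accumulates P[B]
    for omega in outcomes:
        numerator[Y(omega)] += X(omega) * prob[omega]
        mass[Y(omega)] += prob[omega]
    return {y: numerator[y] / mass[y] for y in mass if mass[y] > 0}

# Toy example: a fair die, X is the face value, Y is its parity.
outcomes = range(1, 7)
prob = {omega: Fraction(1, 6) for omega in outcomes}
cond = conditional_expectation(outcomes, prob, X=lambda w: w, Y=lambda w: w % 2)
mean_of_cond = sum(cond[y] * Fraction(1, 2) for y in cond)
```

Averaging the conditional expectation over the atoms recovers the unconditional mean of the die.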

Appendix B: Proof of Lemma 3.1

Since all the random variables generating $\mathcal{N}_{t+1}$ have only
finitely many possible values, the $\sigma$-algebra $\mathcal{N}_{t+1}$ is finite.
Both sides of (2) are $\mathcal{N}_{t+1}$ measurable, so both sides are
constant on each atom of $\mathcal{N}_{t+1}$. Thus, if $A$ denotes an atom
of $\mathcal{N}_{t+1}$, each side of (2) can be viewed as a function of $A$,
and it must be shown that the equality holds for all such $A$.
Below we shall write $w_l(t+1, A)$ for the value of $w_l(t+1)$
on the atom $A$.

Since $P_{t+1} \cup R_{t+1} \in \mathcal{N}_{t+1}$, it follows that either
$A \subset P_{t+1} \cup R_{t+1}$ or $A \subset P^c_{t+1} \cap R^c_{t+1}$. If $A \subset P_{t+1} \cup R_{t+1}$ then
$X(t+1)$ is determined by $A$, and $w(t+1, A) = \delta(X(t+1))$,
so that (2) holds on $A$. So for the remainder of the proof,
assume that $A \subset P^c_{t+1} \cap R^c_{t+1}$.

It follows that $A$ can be expressed as $A = \hat{A} \cap P^c_{t+1} \cap R^c_{t+1}$
for some atom $\hat{A}$ of $\mathcal{N}_t$. Thus for any state $l$,

$$w_l(t+1, A) = P[X(t+1) = l \mid A] = \frac{T_l}{\sum_{l'} T_{l'}},$$

where, letting $w_j(t, \hat{A})$ denote the value of $w_j(t)$ on $\hat{A}$ and
$v_l(t+1, \hat{A})$ denote the value of $v_l(t+1)$ on $\hat{A}$,

$$\begin{aligned}
T_l &= P[R^c_{t+1} \cap \{X(t+1) = l\} \mid \hat{A} \cap P^c_{t+1}] \\
&= E[(1 - v_l(t+1)) I_{\{X(t+1) = l\}} \mid \hat{A} \cap P^c_{t+1}] \\
&= E[(1 - v_l(t+1)) I_{\{X(t+1) = l\}} \mid \hat{A}] \\
&= P[X(t+1) = l \mid \hat{A}]\,(1 - v_l(t+1, \hat{A})) \\
&= \Big( \sum_j w_j(t, \hat{A})\, p_{jl} \Big) (1 - v_l(t+1, \hat{A})).
\end{aligned}$$

Therefore

$$w_l(t+1, A) = \Phi_l(w(t, \hat{A}), v(t+1, \hat{A})),$$

for any atom $A$ of $\mathcal{N}_{t+1}$ with $A \subset P^c_{t+1} \cap R^c_{t+1}$. Lemma 3.1
is proved.



Appendix C: Proof of Lemma 6.3

Lemma 6.3 is proved following the statement and proof of
three lemmas.

Lemma I.1: Consider two monotone sequences of some
finite length $n$: $a_1 \ge a_2 \ge \cdots \ge a_n = 0$ and $0 = b_1 \le b_2 \le \cdots \le b_n$.
Let $c_i = a_i + b_i$ for $1 \le i \le n$, let $d_i = a_i + b_{i+1}$
for $1 \le i \le n - 1$, and let $d_n = 0$. Then $c \prec d$.

Proof: Note that $d_i \ge c_i$ and $d_i \ge c_{i+1}$ for $1 \le i \le n - 1$,
and the sum of the $c$'s is equal to the sum of the $d$'s. Therefore,
for any subset $A$ of $\{1, 2, \ldots, n\}$, there is another subset $A'$
with $|A| = |A'|$ such that $\sum_{i \in A} c_i \le \sum_{i \in A'} d_i$. This proves
the lemma. ■

Lemma I.2: Let $r$ and $L$ be positive integers. Consider the
convolution $F * G$ of two binary valued functions on $\mathbb{Z}$, such
that the support of $F$ has cardinality $r$, and the support of
$G$ is a set of $L$ consecutive integers. Then the convolution is
maximal in the majorization order, if the support of $F$ is a set
of $r$ consecutive integers.

Proof: Suppose without loss of generality that
$G = I_{\{0 \le i \le L-1\}}$. If the support of $F$ is not an interval of integers,
let $j_{\max}$ be the largest integer in the support of $F$ and let $j_0$
be the smallest integer such that the support of $F$ contains the
interval of integers $[j_0, j_{\max}]$. Then $F = F^a + F^b$, such that
$F^a_i = 0$ for $i \ge j_0 - 1$ and the support of $F^b$ is the interval
of integers $[j_0, j_{\max}]$. Let $F'$ be the new function defined by
$F'_i = F^a_i + F^b_{i+1}$. The graph of $F'$ is obtained by sliding the
rightmost portion of the graph of $F$ to the left one unit.

We claim that $F * G \prec F' * G$. To see this, note that
$F * G = F^a * G + F^b * G$. The idea of the proof is to focus on the
interval of integers $I = [j_0 - 1, j_0 + r - 2]$ and appeal to Lemma
I.1. The function $F^a * G$ is nonincreasing on $I$, it takes value
zero at the right endpoint of $I$, and it is also zero everywhere
to the right of $I$. The function $F^b * G$ is nondecreasing on $I$,
it takes value zero at the left endpoint of $I$, and it is also zero
everywhere to the left of $I$. The convolution $F' * G$ is the same
as $F * G$ except the second function $F^b * G$ is shifted one unit
to the right. Lemma I.1 thus implies that $F * G \prec F' * G$.
This procedure can be repeated until $F$ is reduced to a function
with support being a set of $r$ consecutive integers. The lemma
is proved. ■

Lemma 1.3: Let r > 1 and consider the convolution F * b 
such that F is a binary valued function on the integers with 
support of cardinality r. Then the convolution is maximal 
in the majorization order if the support of F consists of r 
consecutive integers. 

Proof: For $i \ge 0$, let $b^{(i)}$ denote the uniform probability
distribution on the interval $[-i, i]$ of $L = 2i + 1$ integers. The
lemma is true if $b = b^{(i)}$ for some $i$ by Lemma I.2. Let $F^*$
denote the unique neat binary valued function with support of
cardinality $r$. Note that $F^* * b^{(i)}$ is neat for all $i \ge 0$
because both $b^{(i)}$ and $F^*$ are neat. In general, $b$ can be
written as $b = \sum_{i=0}^{\infty} \lambda_i b^{(i)}$ for some
probability distribution $\lambda$ on $\mathbb{Z}_+$. Therefore, for
any binary $F$ with support of cardinality $r$,

$$
b * F = \sum_i \lambda_i \,(b^{(i)} * F)
\overset{(a)}{\prec} \sum_i \lambda_i \,(b^{(i)} * F)_\downarrow
\overset{(b)}{\prec} \sum_i \lambda_i \,(b^{(i)} * F^*)_\downarrow
= (b * F^*)_\downarrow \equiv b * F^*.
$$

Here (a) follows from the fact that taking nondecreasing
rearrangements of probability distributions before adding them
increases the sum in the majorization order, and (b) follows
from Lemma I.2. ■
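
The two ingredients of this proof, the mixture representation of $b$ and the maximality of consecutive supports, can be checked on small cases. The sketch below is illustrative only and its helper names are hypothetical. It assumes $b$ is symmetric about 0 and nonincreasing in $|k|$, stored through the values $c_k = b(k) = b(-k)$, in which case one valid choice of weights is $\lambda_i = (2i+1)(c_i - c_{i+1})$ with $c_{m+1} = 0$. That formula is derived here for the illustration and is not stated in the text. The brute-force search is restricted to supports inside a finite window.

```python
from itertools import combinations
import numpy as np

def majorized_by(p, q, tol=1e-12):
    """True if p is majorized by q (zero padding to a common length)."""
    n = max(len(p), len(q))
    ps = np.sort(np.pad(np.asarray(p, dtype=float), (0, n - len(p))))[::-1].cumsum()
    qs = np.sort(np.pad(np.asarray(q, dtype=float), (0, n - len(q))))[::-1].cumsum()
    return bool(abs(ps[-1] - qs[-1]) <= tol and np.all(ps <= qs + tol))

def uniform_mixture_weights(c):
    """Given c[k] = b(k) = b(-k) for k = 0..m with c nonincreasing, return
    lam such that b = sum_i lam[i] * b_i, where b_i is uniform on [-i, i].
    Assumed formula: lam[i] = (2i + 1) * (c[i] - c[i + 1]), with c[m + 1] = 0."""
    c = np.append(np.asarray(c, dtype=float), 0.0)
    i = np.arange(len(c) - 1)
    return (2 * i + 1) * (c[:-1] - c[1:])

def consecutive_support_is_maximal(b, r, window):
    """Brute force: compare b * F for every support of size r inside
    {0, ..., window - 1} against b * F_star with consecutive support."""
    F_star = np.zeros(window)
    F_star[:r] = 1
    top = np.convolve(b, F_star)
    for support in combinations(range(window), r):
        F = np.zeros(window)
        F[list(support)] = 1
        if not majorized_by(np.convolve(b, F), top):
            return False
    return True
```

With `b = np.array([0.1, 0.2, 0.4, 0.2, 0.1])`, the call `uniform_mixture_weights([0.4, 0.2, 0.1])` gives the weights of the three uniform components, and `consecutive_support_is_maximal(b, 3, 7)` compares all 35 supports of size 3 in a window of 7 integers against the consecutive one.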



Proof of Lemma V.3: Fix $r \ge 1$, let $F$ range over all binary
valued functions on $\mathbb{Z}$ with support of cardinality $r$, and let
$F^*$ denote the unique choice of $F$ that is neat. Use
"$\langle \mu, \nu \rangle$" to denote inner products.



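$$
\begin{aligned}
\sum_{i=1}^{r} (\mu * b)_{[i]}
&= \max_F \langle \mu * b, F \rangle = \max_F \langle \mu, b * F \rangle \\
&\overset{(a)}{\le} \max_F \langle \mu_\downarrow, (b * F)_\downarrow \rangle \\
&\overset{(b)}{\le} \langle \mu_\downarrow, (b * F^*)_\downarrow \rangle \\
&\le \langle \nu_\downarrow, (b * F^*)_\downarrow \rangle \\
&\overset{(c)}{=} \langle \nu, b * F^* \rangle = \langle \nu * b, F^* \rangle \\
&\overset{(d)}{=} \sum_{i=1}^{r} (\nu * b)_{[i]}.
\end{aligned}
$$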






Here, (a) follows from the fact that rearranging each of
two distributions in nonincreasing order increases their inner
product, (b) follows from Lemma I.3 and the monotonicity of
$\mu_\downarrow$, (c) follows from the fact that both $\nu$ and
$b * F^*$ are neat, so their inner product is the same as the inner
product of their rearranged probability distributions, and (d) follows
from the fact that $\nu * b$ is neat. ■
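
Two elementary facts used in this proof lend themselves to a quick numerical check: the sum of the $r$ largest values of a function equals the maximum of its inner product with a binary function whose support has cardinality $r$, and sorting two vectors in nonincreasing order does not decrease their inner product (step (a)). The sketch below is illustrative only. The function names are hypothetical, and distributions are represented as finite NumPy arrays.

```python
from itertools import combinations
import numpy as np

def top_r_sum(x, r):
    """Sum of the r largest entries of x."""
    return float(np.sort(np.asarray(x, dtype=float))[::-1][:r].sum())

def max_over_binary(x, r):
    """Maximum of <x, F> over 0/1 vectors F with exactly r ones,
    found by enumerating every support of size r."""
    x = np.asarray(x, dtype=float)
    return max(float(x[list(S)].sum()) for S in combinations(range(len(x)), r))

def sorted_inner_product(x, y):
    """Inner product after sorting both vectors in nonincreasing order."""
    return float(np.dot(np.sort(x)[::-1], np.sort(y)[::-1]))
```

For any nonnegative arrays `x` and `y` of equal length, `top_r_sum(x, r)` and `max_over_binary(x, r)` should agree up to rounding, and `np.dot(x, y)` should not exceed `sorted_inner_product(x, y)`.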